Merge pull request #119 from breezedeus/dev
support new Math Formula Detector based on Ultralytics
breezedeus authored Jun 18, 2024
2 parents a22d48c + 321b962 commit d725ae1
Showing 28 changed files with 302 additions and 134 deletions.
24 changes: 12 additions & 12 deletions README.md
@@ -34,8 +34,8 @@

Major changes:

* Added layout analysis and table recognition models, supporting the conversion of images with complex layouts into Markdown format. See examples: [Pix2Text Online Documentation / Examples](https://pix2text.readthedocs.io/zh/latest/examples_en/).
* Added support for converting entire PDF files to Markdown format. See examples: [Pix2Text Online Documentation / Examples](https://pix2text.readthedocs.io/zh/latest/examples_en/).
* Added layout analysis and table recognition models, supporting the conversion of images with complex layouts into Markdown format. See examples: [Pix2Text Online Documentation / Examples](https://pix2text.readthedocs.io/zh/stable/examples_en/).
* Added support for converting entire PDF files to Markdown format. See examples: [Pix2Text Online Documentation / Examples](https://pix2text.readthedocs.io/zh/stable/examples_en/).
* Enhanced the interface with more features, including adjustments to existing interface parameters.
* Launched the [Pix2Text Online Documentation](https://pix2text.readthedocs.io).

@@ -56,7 +56,7 @@ See more at: [RELEASE.md](docs/RELEASE.md) .
- **Layout Analysis Model**: [breezedeus/pix2text-layout](https://huggingface.co/breezedeus/pix2text-layout) ([Mirror](https://hf-mirror.com/breezedeus/pix2text-layout)).
- **Table Recognition Model**: [breezedeus/pix2text-table-rec](https://huggingface.co/breezedeus/pix2text-table-rec) ([Mirror](https://hf-mirror.com/breezedeus/pix2text-table-rec)).
- **Text Recognition Engine**: Supports **80+ languages** such as **English, Simplified Chinese, Traditional Chinese, Vietnamese**, etc. For English and Simplified Chinese recognition, it uses the open-source OCR tool [CnOCR](https://github.com/breezedeus/cnocr), while for other languages, it uses the open-source OCR tool [EasyOCR](https://github.com/JaidedAI/EasyOCR).
- **Mathematical Formula Detection Model (MFD)**: Mathematical formula detection model (MFD) from [CnSTD](https://github.com/breezedeus/cnstd).
- **Mathematical Formula Detection Model (MFD)**: [breezedeus/pix2text-mfd](https://huggingface.co/breezedeus/pix2text-mfd) ([Mirror](https://hf-mirror.com/breezedeus/pix2text-mfd)). Implemented based on [CnSTD](https://github.com/breezedeus/cnstd).
- **Mathematical Formula Recognition Model (MFR)**: [breezedeus/pix2text-mfr](https://huggingface.co/breezedeus/pix2text-mfr) ([Mirror](https://hf-mirror.com/breezedeus/pix2text-mfr)).

Several models are contributed by other open-source authors, and their contributions are highly appreciated.
@@ -65,7 +65,7 @@ Several models are contributed by other open-source authors, and their contribut
<img src="docs/figs/arch-flow.jpg" alt="Pix2Text Arch Flow"/>
</div>

For detailed explanations, please refer to the [Pix2Text Online Documentation/Models](https://pix2text.readthedocs.io/zh/latest/models/).
For detailed explanations, please refer to the [Pix2Text Online Documentation/Models](https://pix2text.readthedocs.io/zh/stable/models/).

<br/>

@@ -74,12 +74,12 @@ As a Python3 toolkit, P2T may not be very user-friendly for those who are not fa
If you're interested, feel free to add the assistant as a friend by scanning the QR code and mentioning `p2t`. The assistant will regularly invite everyone to join the group where the latest developments related to P2T tools will be announced:

<div align="center">
<img src="https://pix2text.readthedocs.io/zh/latest/figs/wx-qr-code.JPG" alt="Wechat-QRCode" width="300px"/>
<img src="https://pix2text.readthedocs.io/zh/stable/figs/wx-qr-code.JPG" alt="Wechat-QRCode" width="300px"/>
</div>

The author also maintains a **Knowledge Planet** [**P2T/CnOCR/CnSTD Private Group**](https://t.zsxq.com/FEYZRJQ), where questions are answered promptly. You're welcome to join. The **knowledge planet private group** will also gradually release some private materials related to P2T/CnOCR/CnSTD, including **some unreleased models**, **discounts on purchasing premium models**, **code snippets for different application scenarios**, and answers to difficult problems encountered during use. The planet will also publish the latest research materials related to P2T/OCR/STD.

For more contact methods, please refer to [Contact](https://pix2text.readthedocs.io/zh/latest/contact/).
For more contact methods, please refer to [Contact](https://pix2text.readthedocs.io/zh/stable/contact/).


## List of Supported Languages
@@ -196,15 +196,15 @@ You can also try the **[Online Demo](https://huggingface.co/spaces/breezedeus/Pi

## Examples

See: [Pix2Text Online Documentation/Examples](https://pix2text.readthedocs.io/zh/latest/examples_en/).
See: [Pix2Text Online Documentation/Examples](https://pix2text.readthedocs.io/zh/stable/examples_en/).

## Usage

See: [Pix2Text Online Documentation/Usage](https://pix2text.readthedocs.io/zh/latest/usage/).
See: [Pix2Text Online Documentation/Usage](https://pix2text.readthedocs.io/zh/stable/usage/).

## Models

See: [Pix2Text Online Documentation/Models](https://pix2text.readthedocs.io/zh/latest/models/).
See: [Pix2Text Online Documentation/Models](https://pix2text.readthedocs.io/zh/stable/models/).

## Install

@@ -226,15 +226,15 @@ If the installation is slow, you can specify an installation source, such as usi
pip install pix2text -i https://mirrors.aliyun.com/pypi/simple
```

For more information, please refer to: [Pix2Text Online Documentation/Install](https://pix2text.readthedocs.io/zh/latest/install/).
For more information, please refer to: [Pix2Text Online Documentation/Install](https://pix2text.readthedocs.io/zh/stable/install/).

## Command Line Tool

See: [Pix2Text Online Documentation/Command Tool](https://pix2text.readthedocs.io/zh/latest/command/).
See: [Pix2Text Online Documentation/Command Tool](https://pix2text.readthedocs.io/zh/stable/command/).

## HTTP Service

See: [Pix2Text Online Documentation/Command Tool/Start Service](https://pix2text.readthedocs.io/zh/latest/command/).
See: [Pix2Text Online Documentation/Command Tool/Start Service](https://pix2text.readthedocs.io/zh/stable/command/).


## MacOS Desktop Application
16 changes: 14 additions & 2 deletions docs/RELEASE.md
@@ -1,14 +1,26 @@
# Release Notes

## Update 2024.06.18:**V1.1.1** Released

Major changes:

* Added support for the new Mathematical Formula Detection (MFD) models, which significantly improve formula detection accuracy.


主要变更:

* 支持新的数学公式检测模型(MFD),公式检测精度获得较大提升。


## Update 2024.06.17:**V1.1.0.7** Released

Major changes:

* Adapted to cnstd==1.2.4, thanks to https://github.com/g1y5x3.
* Adapted to cnstd>=1.2.4, thanks to [@g1y5x3](https://github.com/g1y5x3).

主要变更:

* 适配 cnstd==1.2.4 ,感谢 https://github.com/g1y5x3
* 适配 cnstd>=1.2.4 ,感谢 [@g1y5x3](https://github.com/g1y5x3)

## Update 2024.06.04:**V1.1.0.6** Released

8 changes: 4 additions & 4 deletions docs/buymeacoffee.md
@@ -9,11 +9,11 @@ By supporting my projects through a donation, you can be a part of this journey

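## 1. Knowledge Planet

You are welcome to join the **Knowledge Planet** **[CnOCR/CnSTD Private Group](https://t.zsxq.com/FEYZRJQ)**. The **Knowledge Planet private group** will gradually release private materials related to CnOCR/CnSTD/P2T.
For more details about the planet, see [Knowledge Planet | Breezedeus.com](https://www.breezedeus.com/article/zsxq)
You are welcome to join the **Knowledge Planet** **[P2T/CnOCR/CnSTD Private Group](https://t.zsxq.com/FEYZRJQ)**. The **Knowledge Planet private group** will gradually release private materials related to CnOCR/CnSTD/P2T.
For more details about the benefits available to planet members, see [Knowledge Planet | Breezedeus.com](https://www.breezedeus.com/article/zsxq)

<figure markdown>
![Knowledge Planet QR code](https://cnocr.readthedocs.io/zh/latest/cnocr-zsxq.jpeg){: style="width:280px"}
![Knowledge Planet QR code](https://cnocr.readthedocs.io/zh/stable/cnocr-zsxq.jpeg){: style="width:280px"}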
</figure>


@@ -23,7 +23,7 @@ By supporting my projects through a donation, you can be a part of this journey
Give the author a reward through Alipay.

<figure markdown>
![Alipay payment QR code](https://cnocr.readthedocs.io/zh/latest/cnocr-zfb.jpg){: style="width:280px"}
![Alipay payment QR code](https://cnocr.readthedocs.io/zh/stable/cnocr-zfb.jpg){: style="width:280px"}
</figure>


4 changes: 2 additions & 2 deletions docs/command.md
@@ -53,7 +53,7 @@ p2t predict -l en,ch_sim --resized-shape 768 --file-type pdf -i docs/examples/te
Prediction also supports custom parameters or models. For example, to predict with a custom model:

```bash
p2t predict -l en,ch_sim --mfd-config '{"model_type": "yolov7", "model_fp": "/Users/king/.cnstd/1.2/analysis/mfd-yolov7-epoch224-20230613.pt"}' --formula-ocr-config '{"model_name":"mfr-pro","model_backend":"onnx"}' --text-ocr-config '{"rec_model_name": "doc-densenet_lite_666-gru_large"}' --rec-kwargs '{"page_numbers": [0, 1]}' --resized-shape 768 --file-type pdf -i docs/examples/test-doc.pdf -o output-md --save-debug-res output-debug
p2t predict -l en,ch_sim --mfd-config '{"model_name": "mfd-pro", "model_backend": "onnx"}' --formula-ocr-config '{"model_name":"mfr-pro","model_backend":"onnx"}' --text-ocr-config '{"rec_model_name": "doc-densenet_lite_666-gru_large"}' --rec-kwargs '{"page_numbers": [0, 1]}' --resized-shape 768 --file-type pdf -i docs/examples/test-doc.pdf -o output-md --save-debug-res output-debug
```
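Hand-quoting several JSON arguments on one command line is error-prone. The following is a minimal sketch, not part of Pix2Text itself, that assembles the same `p2t predict` invocation in Python and lets `json.dumps` and `shlex.join` handle the quoting. The helper name `build_predict_command` is hypothetical; the flags and config values are copied from the command above.

```python
import json
import shlex


def build_predict_command(input_path, output_dir, languages=("en", "ch_sim")):
    """Return the `p2t predict` argument list (hypothetical helper, for illustration)."""
    # Config values mirror the documented command; adjust them to the models you have.
    mfd_config = {"model_name": "mfd-pro", "model_backend": "onnx"}
    formula_config = {"model_name": "mfr-pro", "model_backend": "onnx"}
    text_config = {"rec_model_name": "doc-densenet_lite_666-gru_large"}
    rec_kwargs = {"page_numbers": [0, 1]}
    return [
        "p2t", "predict",
        "-l", ",".join(languages),
        "--mfd-config", json.dumps(mfd_config),
        "--formula-ocr-config", json.dumps(formula_config),
        "--text-ocr-config", json.dumps(text_config),
        "--rec-kwargs", json.dumps(rec_kwargs),
        "--resized-shape", "768",
        "--file-type", "pdf",
        "-i", input_path,
        "-o", output_dir,
    ]


cmd = build_predict_command("docs/examples/test-doc.pdf", "output-md")
# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```

The list form can also be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting altogether; this assumes the `p2t` executable is on `PATH`.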


@@ -99,7 +99,7 @@ p2t serve -l en,ch_sim -H 0.0.0.0 -p 8503
The service can also be started with custom parameters or models. For example, to serve predictions with a custom model:

```bash
p2t serve -l en,ch_sim --mfd-config '{"model_type": "yolov7", "model_fp": "/Users/king/.cnstd/1.2/analysis/mfd-yolov7-epoch224-20230613.pt"}' --formula-ocr-config '{"model_name":"mfr-pro","model_backend":"onnx"}' --text-ocr-config '{"rec_model_name": "doc-densenet_lite_666-gru_large"}' -H 0.0.0.0 -p 8503
p2t serve -l en,ch_sim --mfd-config '{"model_name": "mfd-pro", "model_backend": "onnx"}' --formula-ocr-config '{"model_name":"mfr-pro","model_backend":"onnx"}' --text-ocr-config '{"rec_model_name": "doc-densenet_lite_666-gru_large"}' -H 0.0.0.0 -p 8503
```

### Calling the Service
8 changes: 4 additions & 4 deletions docs/contact.md
@@ -5,7 +5,7 @@
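## 1. Knowledge Planet [**P2T/CnOCR/CnSTD Private Group**](https://t.zsxq.com/FEYZRJQ)

The author maintains the **Knowledge Planet** [**P2T/CnOCR/CnSTD Private Group**](https://t.zsxq.com/FEYZRJQ); you are welcome to join. The **Knowledge Planet private group** will gradually release private materials related to P2T/CnOCR/CnSTD.
For more details about the planet, see [Knowledge Planet | Breezedeus.com](https://www.breezedeus.com/article/zsxq)
For more details about the benefits available to planet members, see [Knowledge Planet | Breezedeus.com](https://www.breezedeus.com/article/zsxq)

<figure markdown>
![Knowledge Planet QR code](https://cnocr.readthedocs.io/zh/latest/cnocr-zsxq.jpeg){: style="width:280px"}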
@@ -20,14 +20,14 @@
![WeChat discussion group](https://huggingface.co/datasets/breezedeus/cnocr-wx-qr-code/resolve/main/wx-qr-code.JPG){: style="width:270px"}
</figure>

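Normally the assistant invites people into the group periodically, but the timing cannot be guaranteed. If you want a quicker reply, you can join the Knowledge Planet [**CnOCR/CnSTD Private Group**](https://t.zsxq.com/FEYZRJQ) above.
Normally the assistant invites people into the group periodically, but the timing cannot be guaranteed. If you want a quicker reply, you can join the Knowledge Planet [**P2T/CnOCR/CnSTD Private Group**](https://t.zsxq.com/FEYZRJQ) above.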


## 3. Discord

欢迎加入 [**我的Discord 服务器**](https://discord.gg/GgD87WM8Tf)
欢迎加入 [**Pix2Text Discord 服务器**](https://discord.gg/GgD87WM8Tf)

You are welcome to join [**my Discord Server**](https://discord.gg/GgD87WM8Tf).
You are welcome to join the [**Pix2Text Discord Server**](https://discord.gg/GgD87WM8Tf).


## 4. Email
2 changes: 1 addition & 1 deletion docs/demo.md
@@ -11,7 +11,7 @@

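## Online Demo 🤗

You can also use the **[Online Demo](https://huggingface.co/spaces/breezedeus/Pix2Text-Demo)** (if you cannot reach Hugging Face directly, use the [demo hosted in China](https://hf-mirror.com/spaces/breezedeus/Pix2Text-Demo)) to try **P2T** on different languages. The online demo runs on modest hardware, so it is relatively slow. For Simplified Chinese or English images, the **[P2T web version](https://p2t.breezedeus.com)** is recommended.
You can also use the **[Online Demo](https://huggingface.co/spaces/breezedeus/Pix2Text-Demo)** (if you cannot reach Hugging Face directly, use the [mirror in China](https://hf.qhduan.com/spaces/breezedeus/Pix2Text-Demo)) to try **P2T** on different languages. The online demo runs on modest hardware, so it is relatively slow. For Simplified Chinese or English images, the **[P2T web version](https://p2t.breezedeus.com)** is recommended.

<figure markdown>
![Online Demo](https://pic3.zhimg.com/80/v2-ebe8d3d955a580a297aabcd27439604e_720w.webp)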