
English | [简体中文](README_cn.md)

## Introduction
PaddleOCR aims to create rich, leading, and practical OCR tools that help users train better models and apply them in practice.

**Recent updates**
- 2020.8.24, Support using PaddleOCR through whl package installation; please refer to [PaddleOCR Package](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/whl_en.md)
- 2020.8.16, Release text detection algorithm [SAST](https://arxiv.org/abs/1908.05498) and text recognition algorithm [SRN](https://arxiv.org/abs/2003.12294)
- 2020.7.23, Release the playback and slides (PPT) of the live class "PaddleOCR Introduction" on Bilibili: [address](https://aistudio.baidu.com/aistudio/course/introduce/1519)
- 2020.7.15, Add mobile App demo, supporting both iOS and Android (based on EasyEdge and Paddle Lite)
- 2020.7.15, Improve deployment ability: add C++ inference and serving deployment. In addition, benchmarks of the ultra-lightweight OCR model are provided.
- 2020.7.15, Add several related datasets, data annotation and synthesis tools.
- [more](./doc/doc_en/update_en.md)

## Features
- Ultra-lightweight OCR model, total model size is only 8.6M
    - A single model supports combined Chinese, English, and digit recognition, vertical text recognition, and long text recognition
    - Detection model DB (4.1M) + recognition model CRNN (4.5M)
- Various text detection algorithms: EAST, DB, SAST
- Various text recognition algorithms: Rosetta, CRNN, STAR-Net, RARE, SRN
- Support Linux, Windows, macOS and other systems.

## Visualization

![](doc/imgs_results/11.jpg)

![](doc/imgs_results/img_10.jpg)

[More visualization](./doc/doc_en/visualization_en.md)

You can also quickly try the ultra-lightweight OCR model: [Online Experience](https://www.paddlepaddle.org.cn/hub/scene/ocr)

Mobile demo (based on EasyEdge and Paddle Lite, supports iOS and Android): [Sign in to the website to obtain the QR code for installing the App](https://ai.baidu.com/easyedge/app/openSource?from=paddlelite)

You can also scan the QR code below to install the App (**Android only**)

<div align="center">
<img src="./doc/ocr-android-easyedge.png"  width = "200" height = "200" />
</div>

- [**OCR Quick Start**](./doc/doc_en/quickstart_en.md)

<a name="Supported-Chinese-model-list"></a>

### Supported Models:

|Model Name|Description|Detection model link|Recognition model link|Recognition model link (with space recognition)|
|-|-|-|-|-|
|db_crnn_mobile|Ultra-lightweight OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar)|
|db_crnn_server|General OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar)|


## Tutorials
- [Installation](./doc/doc_en/installation_en.md)
- [Quick Start](./doc/doc_en/quickstart_en.md)
- Algorithm introduction
    - [Text Detection Algorithm](#TEXTDETECTIONALGORITHM)
    - [Text Recognition Algorithm](#TEXTRECOGNITIONALGORITHM)
    - [END-TO-END OCR Algorithm](#ENDENDOCRALGORITHM)
- Model training/evaluation
    - [Text Detection](./doc/doc_en/detection_en.md)
    - [Text Recognition](./doc/doc_en/recognition_en.md)
    - [Yml Configuration](./doc/doc_en/config_en.md)
    - [Tricks](./doc/doc_en/tricks_en.md)
- Deployment
    - [Python Inference](./doc/doc_en/inference_en.md)
    - [C++ Inference](./deploy/cpp_infer/readme_en.md)
    - [Serving](./doc/doc_en/serving_en.md)
    - [Mobile](./deploy/lite/readme_en.md)
    - Model Quantization and Compression (coming soon)
    - [Benchmark](./doc/doc_en/benchmark_en.md)
- Datasets
    - [General OCR Datasets(Chinese/English)](./doc/doc_en/datasets_en.md)
    - [HandWritten_OCR_Datasets(Chinese)](./doc/doc_en/handwritten_datasets_en.md)
    - [Various OCR Datasets(multilingual)](./doc/doc_en/vertical_and_multilingual_datasets_en.md)
    - [Data Annotation Tools](./doc/doc_en/data_annotation_en.md)
    - [Data Synthesis Tools](./doc/doc_en/data_synthesis_en.md)
- [FAQ](#FAQ)
- Visualization
    - [Ultra-lightweight Chinese/English OCR Visualization](#UCOCRVIS)
    - [General Chinese/English OCR Visualization](#GeOCRVIS)
    - [Chinese/English OCR Visualization (with Space Recognition)](#SpaceOCRVIS)
- [Community](#Community)
- [References](./doc/doc_en/reference_en.md)
- [License](#LICENSE)
- [Contribution](#CONTRIBUTION)

<a name="TEXTDETECTIONALGORITHM"></a>
## Text Detection Algorithm

PaddleOCR provides the following open-source text detection algorithms:
- [x] EAST ([paper](https://arxiv.org/abs/1704.03155))
- [x] DB ([paper](https://arxiv.org/abs/1911.08947))
- [x] SAST ([paper](https://arxiv.org/abs/1908.05498)) (Baidu Self-Research)

On the ICDAR2015 dataset, the text detection results are as follows:

|Model|Backbone|precision|recall|Hmean|Download link|
|-|-|-|-|-|-|
|EAST|ResNet50_vd|88.18%|85.51%|86.82%|[Download link](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)|
|EAST|MobileNetV3|81.67%|79.83%|80.74%|[Download link](https://paddleocr.bj.bcebos.com/det_mv3_east.tar)|
|DB|ResNet50_vd|83.79%|80.65%|82.19%|[Download link](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)|
|DB|MobileNetV3|75.92%|73.18%|74.53%|[Download link](https://paddleocr.bj.bcebos.com/det_mv3_db.tar)|
|SAST|ResNet50_vd|92.18%|82.96%|87.33%|[Download link](https://paddleocr.bj.bcebos.com/SAST/sast_r50_vd_icdar2015.tar)|
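
The Hmean column is the harmonic mean of precision and recall. As a quick consistency check, the sketch below recomputes it for a few rows of the ICDAR2015 table. The function name `hmean` is chosen here for illustration and is not part of PaddleOCR.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (model, backbone, precision, recall) copied from the ICDAR2015 table above
rows = [
    ("EAST", "ResNet50_vd", 88.18, 85.51),
    ("DB", "ResNet50_vd", 83.79, 80.65),
    ("SAST", "ResNet50_vd", 92.18, 82.96),
]

for model, backbone, p, r in rows:
    print(f"{model} ({backbone}): Hmean = {hmean(p, r):.2f}%")
```

Because the harmonic mean is pulled toward the smaller of the two values, a model with high precision but lower recall (such as SAST here) gets an Hmean closer to its recall than a plain average would suggest.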

On the Total-Text dataset, the text detection results are as follows:

|Model|Backbone|precision|recall|Hmean|Download link|
|-|-|-|-|-|-|
|SAST|ResNet50_vd|88.74%|79.80%|84.03%|[Download link](https://paddleocr.bj.bcebos.com/SAST/sast_r50_vd_total_text.tar)|

**Note:** Additional data, such as icdar2013, icdar2017, COCO-Text, and ArT, was added to the training of the SAST models. The English public datasets, organized in the format used by PaddleOCR, can be downloaded from [Baidu Drive](https://pan.baidu.com/s/12cPnZcVuV1zn5DOd4mqjVw) (download code: 2bpi).

Using the [LSVT](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/datasets_en.md#1-icdar2019-lsvt) street view dataset, with a total of 30,000 training images, the related configuration and pre-trained models for the text detection task are as follows:

|Model|Backbone|Configuration file|Pre-trained model|
|-|-|-|-|
|ultra-lightweight OCR model|MobileNetV3|det_mv3_db.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|
|General OCR model|ResNet50_vd|det_r50_vd_db.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|

**Note:** When training and evaluating the DB models above, the post-processing parameters `box_thresh=0.6` and `unclip_ratio=1.5` need to be set. With different datasets or different models, these two parameters can be adjusted for better results.
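
To give a feel for what these two parameters control, here is a minimal sketch. It follows the DB paper, where a detected (shrunk) region is expanded by an offset D = A × r / L, with A the polygon area, L its perimeter, and r the unclip ratio. It also assumes that `box_thresh` acts as a minimum score for keeping a box. All helper names are hypothetical, and PaddleOCR's actual post-processing may differ in detail.

```python
import math


def polygon_area(points):
    """Shoelace formula for a simple polygon given as [(x, y), ...]."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths, closing the polygon back to its first point."""
    n = len(points)
    return sum(
        math.hypot(points[(i + 1) % n][0] - points[i][0],
                   points[(i + 1) % n][1] - points[i][1])
        for i in range(n)
    )


def unclip_distance(points, unclip_ratio=1.5):
    """Offset (in pixels) by which the polygon would be expanded: A * r / L."""
    return polygon_area(points) * unclip_ratio / polygon_perimeter(points)


def keep_box(score, box_thresh=0.6):
    """Assumed rule: drop candidate boxes whose score is below box_thresh."""
    return score >= box_thresh


# A 100 x 20 pixel text region: area 2000, perimeter 240
region = [(0, 0), (100, 0), (100, 20), (0, 20)]
print(unclip_distance(region, unclip_ratio=1.5))  # 12.5
print(keep_box(0.55), keep_box(0.72))             # False True
```

Under these assumptions, a larger `unclip_ratio` produces looser boxes around each text region, and a higher `box_thresh` keeps fewer but more confident boxes.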

For the training guide and use of PaddleOCR text detection algorithms, please refer to the document [Text detection model training/evaluation/prediction](./doc/doc_en/detection_en.md)

<a name="TEXTRECOGNITIONALGORITHM"></a>
## Text Recognition Algorithm

PaddleOCR provides the following open-source text recognition algorithms:
- [x] CRNN ([paper](https://arxiv.org/abs/1507.05717))
- [x] Rosetta ([paper](https://arxiv.org/abs/1910.05085))
- [x] STAR-Net ([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))
- [x] RARE ([paper](https://arxiv.org/abs/1603.03915v1))
- [x] SRN ([paper](https://arxiv.org/abs/2003.12294)) (Baidu Self-Research)

Following [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation results of the text recognition algorithms above (trained on MJSynth and SynthText; evaluated on IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE) are as follows:

|Model|Backbone|Avg Accuracy|Module combination|Download link|
|-|-|-|-|-|
|Rosetta|Resnet34_vd|80.24%|rec_r34_vd_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_none_ctc.tar)|
|Rosetta|MobileNetV3|78.16%|rec_mv3_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_none_none_ctc.tar)|
|CRNN|Resnet34_vd|82.20%|rec_r34_vd_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_bilstm_ctc.tar)|
|CRNN|MobileNetV3|79.37%|rec_mv3_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar)|
|STAR-Net|Resnet34_vd|83.93%|rec_r34_vd_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_ctc.tar)|
|STAR-Net|MobileNetV3|81.56%|rec_mv3_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_ctc.tar)|
|RARE|Resnet34_vd|84.90%|rec_r34_vd_tps_bilstm_attn|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_attn.tar)|
|RARE|MobileNetV3|83.32%|rec_mv3_tps_bilstm_attn|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_attn.tar)|
|SRN|Resnet50_vd_fpn|88.33%|rec_r50fpn_vd_none_srn|[Download link](https://paddleocr.bj.bcebos.com/SRN/rec_r50fpn_vd_none_srn.tar)|
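
The "Module combination" names in the table appear to follow the four-stage DTRB layout: backbone, transformation, sequence modeling, and prediction. The sketch below splits a name into those parts under that assumption. The function and field names are hypothetical, and the SRN entry does not follow the four-part pattern, so it is rejected rather than guessed at.

```python
def parse_module_combination(name):
    """Split e.g. 'rec_r34_vd_tps_bilstm_ctc' into its assumed four stages."""
    tokens = name.split("_")
    if len(tokens) < 5 or tokens[0] != "rec":
        raise ValueError(f"unexpected module combination: {name!r}")
    transform, sequence, prediction = tokens[-3:]
    if prediction not in ("ctc", "attn"):
        # e.g. 'rec_r50fpn_vd_none_srn' uses a different naming layout
        raise ValueError(f"unsupported prediction head in: {name!r}")
    return {
        "backbone": "_".join(tokens[1:-3]),  # e.g. 'r34_vd' or 'mv3'
        "transformation": transform,         # 'tps' or 'none'
        "sequence": sequence,                # 'bilstm' or 'none'
        "prediction": prediction,            # 'ctc' or 'attn'
    }


for name in ("rec_mv3_none_none_ctc", "rec_r34_vd_tps_bilstm_attn"):
    print(name, "->", parse_module_combination(name))
```

Read this way, Rosetta is a backbone with a CTC head only, CRNN adds a BiLSTM, STAR-Net adds a TPS transformation in front, and RARE swaps the CTC head for attention.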

**Note:** The SRN model uses a data expansion method to enlarge the two training sets mentioned above. The expanded data can be downloaded from [Baidu Drive](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA) (download code: y3ry).

The average accuracy of the two-stage training in the original paper is 89.74%, and that of the one-stage training in PaddleOCR is 88.33%. Both pre-trained weights can be downloaded [here](https://paddleocr.bj.bcebos.com/SRN/rec_r50fpn_vd_none_srn.tar).

We use the [LSVT](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/datasets_en.md#1-icdar2019-lsvt) dataset and crop 300,000 training samples from the original photos using the position ground truth, applying calibration where needed. In addition, 5 million synthetic samples are generated from the LSVT corpus to train the model. The related configuration and pre-trained models are as follows:

|Model|Backbone|Configuration file|Pre-trained model|Models with space recognition|
|-|-|-|-|-|
|Ultra-lightweight OCR model|MobileNetV3|rec_chinese_lite_train.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) & [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar)|
|General OCR model|Resnet34_vd|rec_chinese_common_train.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) & [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar)|

For the training guide and use of PaddleOCR text recognition algorithms, please refer to the document [Text recognition model training/evaluation/prediction](./doc/doc_en/recognition_en.md).

<a name="ENDENDOCRALGORITHM"></a>
## END-TO-END OCR Algorithm
- [ ]  [End2End-PSL](https://arxiv.org/abs/1909.07808)(Baidu Self-Research, coming soon)

## Visualization

<a name="UCOCRVIS"></a>
### 1. Ultra-lightweight Chinese/English OCR Visualization [more](./doc/doc_en/visualization_en.md)

<div align="center">
    <img src="doc/imgs_results/1.jpg" width="800">
</div>

<a name="GeOCRVIS"></a>
### 2. General Chinese/English OCR Visualization [more](./doc/doc_en/visualization_en.md)

<div align="center">
    <img src="doc/imgs_results/chinese_db_crnn_server/11.jpg" width="800">
</div>

<a name="SpaceOCRVIS"></a>
### 3. Chinese/English OCR Visualization (with Space Recognition) [more](./doc/doc_en/visualization_en.md)

<div align="center">
    <img src="doc/imgs_results/chinese_db_crnn_server/en_paper.jpg" width="800">
</div>

<a name="FAQ"></a>

## FAQ
1. Error when using attention-based recognition model: KeyError: 'predict'

    The inference of recognition model based on attention loss is still being debugged. For Chinese text recognition, it is recommended to choose the recognition model based on CTC loss first. In practice, it is also found that the recognition model based on attention loss is not as effective as the one based on CTC loss.

2. About inference speed

    When there are a lot of texts in the picture, the prediction time will increase. You can use `--rec_batch_num` to set a smaller prediction batch size. The default value is 30, which can be changed to 10 or other values.

3. Service deployment and mobile deployment

    It is expected that the service deployment based on Serving and the mobile deployment based on Paddle Lite will be released successively in mid-to-late June. Stay tuned for more updates.

4. Release time of self-developed algorithm

    Baidu Self-developed algorithms such as SAST, SRN and end2end PSL will be released in June or July. Please be patient.
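
For item 2, the effect of `--rec_batch_num` can be pictured with the sketch below, which assumes that the cropped text regions of one image are passed to the recognizer in consecutive batches of at most that size. The helper `iter_batches` and the placeholder crop names are hypothetical and only illustrate the batching arithmetic, not PaddleOCR's internal code.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Placeholder names standing in for 70 cropped text regions from one image
crops = [f"crop_{i}" for i in range(70)]

for batch_size in (30, 10):
    sizes = [len(batch) for batch in iter_batches(crops, batch_size)]
    print(f"rec_batch_num={batch_size}: {len(sizes)} batches, sizes {sizes}")
```

A smaller value means more recognizer calls per image, each handling fewer crops at once.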

[more](./doc/doc_en/FAQ_en.md)

<a name="Community"></a>
## Community
Scan the QR code below with WeChat and complete the questionnaire to join the official technical exchange group.

<div align="center">
<img src="./doc/joinus.jpg"  width = "200" height = "200" />
</div>

<a name="LICENSE"></a>
## License
This project is released under <a href="https://github.com/PaddlePaddle/PaddleOCR/blob/master/LICENSE">Apache 2.0 license</a>

<a name="CONTRIBUTION"></a>
## Contribution
We welcome all contributions to PaddleOCR and greatly appreciate your feedback.

- Many thanks to [Khanh Tran](https://github.com/xxxpsyduck) and [Karl Horky](https://github.com/karlhorky) for contributing and revising the English documentation.
- Many thanks to [zhangxin](https://github.com/ZhangXinNan) for contributing the new visualization function, adding `.gitignore`, and removing the need to set PYTHONPATH manually.
- Many thanks to [lyl120117](https://github.com/lyl120117) for contributing the code for printing the network structure.
- Thanks to [xiangyubo](https://github.com/xiangyubo) for contributing the handwritten Chinese OCR datasets.
- Thanks to [authorfu](https://github.com/authorfu) for contributing the Android demo and [xiadeye](https://github.com/xiadeye) for contributing the iOS demo.
- Thanks to [BeyondYourself](https://github.com/BeyondYourself) for contributing many great suggestions and simplifying part of the code style.
- Thanks to [tangmq](https://gitee.com/tangmq) for contributing Dockerized deployment services to PaddleOCR and supporting the rapid release of callable RESTful API services.
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								English | [简体中文](README_cn.md)
-												Distinguish between English and Chinese documents

											
										
										
											2020-06-09 20:03:49 +08:00
-												Update README.md
											
										
										
											2020-07-21 15:47:14 +08:00
+								## Introduction
 								PaddleOCR aims to create rich, leading, and practical OCR tools that help users train better models and apply them into practice.
-												Update README.md
											
										
										
											2020-07-20 20:26:02 +08:00
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								**Recent updates**
-												Update README.md
											
										
										
											2020-08-24 20:59:52 +08:00
+								- 2020.8.24 Support the use of PaddleOCR through whl package installation，pelease refer  [PaddleOCR Package](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/whl_en.md)
-												add total-text metrics

											
										
										
											2020-08-16 20:07:01 +08:00
+								- 2020.8.16, Release text detection algorithm [SAST](https://arxiv.org/abs/1908.05498) and text recognition algorithm [SRN](https://arxiv.org/abs/2003.12294)
-												Update README.md
											
										
										
											2020-07-23 10:11:45 +08:00
+								- 2020.7.23, Release the playback and PPT of live class on BiliBili station, PaddleOCR Introduction, [address](https://aistudio.baidu.com/aistudio/course/introduce/1519)
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								- 2020.7.15, Add mobile App demo , support both iOS and  Android  ( based on easyedge and Paddle Lite)
-												Fix spelling errors
											
										
										
											2020-08-19 20:45:32 +08:00
+								- 2020.7.15, Improve the  deployment ability, add the C + +  inference , serving deployment. In addition, the benchmarks of the ultra-lightweight OCR model are provided.
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								- 2020.7.15, Add several related datasets, data annotation and synthesis tools.
 								- [more](./doc/doc_en/update_en.md)
-												Update README.md
											
										
										
											2020-07-16 13:58:39 +08:00
-												Update README.md
											
										
										
											2020-07-17 16:03:36 +08:00
+								## Features
-												Update README.md
											
										
										
											2020-07-17 16:52:14 +08:00
+								- Ultra-lightweight OCR model, total model size is only 8.6M
-												Update README.md
											
										
										
											2020-07-17 16:53:13 +08:00
+								    - Single model supports Chinese/English numbers combination recognition, vertical text recognition, long text recognition
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								    - Detection model DB (4.1M) + recognition model CRNN (4.5M)
 								- Various text detection algorithms: EAST, DB
 								- Various text recognition algorithms: Rosetta, CRNN, STAR-Net, RARE
-												Fix spelling errors
											
										
										
											2020-08-19 20:45:32 +08:00
+								- Support Linux, Windows, macOS and other systems.
-												fix mainpage

											
										
										
											2020-05-14 11:05:33 +08:00
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								## Visualization
-												update doc

											
										
										
											2020-05-14 00:01:00 +08:00
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								![](doc/imgs_results/11.jpg)
-												add demo img

											
										
										
											2020-05-14 01:51:17 +08:00
-												Update README.md
											
										
										
											2020-07-17 17:04:31 +08:00
+								![](doc/imgs_results/img_10.jpg)
-												Update README.md
											
										
										
											2020-07-17 17:03:11 +08:00
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								[More visualization](./doc/doc_en/visualization_en.md)
-												fix mainpage

											
										
										
											2020-05-14 11:05:33 +08:00
-												Update README.md
											
										
										
											2020-07-17 16:52:14 +08:00
+								You can also quickly experience the ultra-lightweight OCR : [Online Experience](https://www.paddlepaddle.org.cn/hub/scene/ocr)
-												add ocr-android-easyedge demo qrcode

											
										
										
											2020-07-14 20:27:31 +08:00
-												Fix spelling errors
											
										
										
											2020-08-19 20:45:32 +08:00
+								Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Android systems): [Sign in to the website to obtain the QR code for  installing the App](https://ai.baidu.com/easyedge/app/openSource?from=paddlelite)
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
-												Fix spelling errors
											
										
										
											2020-08-19 20:45:32 +08:00
+								 Also, you can scan the QR code below to install the App (**Android support only**)
-												add ocr-android-easyedge demo qrcode

											
										
										
											2020-07-14 20:27:31 +08:00
 								<div align="center">
 								<img src="./doc/ocr-android-easyedge.png"  width = "200" height = "200" />
 								</div>
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								- [**OCR Quick Start**](./doc/doc_en/quickstart_en.md)
-												add benchmark & mobile demo qr code

											
										
										
											2020-07-13 22:19:42 +08:00
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								<a name="Supported-Chinese-model-list"></a>
-												add benchmark & mobile demo qr code

											
										
										
											2020-07-13 22:19:42 +08:00
-												Update README.md
											
										
										
											2020-07-17 16:52:14 +08:00
+								### Supported Models:
-												updata doc of infer

											
										
										
											2020-05-15 22:07:18 +08:00
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								|Model Name|Description |Detection Model link|Recognition Model link| Support for space Recognition Model link|
-												updata readme

											
										
										
											2020-07-10 23:22:32 +08:00
+								|-|-|-|-|-|
-												Update README.md
											
										
										
											2020-07-17 16:54:13 +08:00
+								|db_crnn_mobile|ultra-lightweight OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) / [pre-train model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar)
 								|db_crnn_server|General OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) / [pre-train model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar)
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
 								## Tutorials
 								- [Installation](./doc/doc_en/installation_en.md)
 								- [Quick Start](./doc/doc_en/quickstart_en.md)
 								- Algorithm introduction
 								    - [Text Detection Algorithm](#TEXTDETECTIONALGORITHM)
 								    - [Text Recognition Algorithm](#TEXTRECOGNITIONALGORITHM)
 								    - [END-TO-END OCR Algorithm](#ENDENDOCRALGORITHM)
 								- Model training/evaluation
 								    - [Text Detection](./doc/doc_en/detection_en.md)
 								    - [Text Recognition](./doc/doc_en/recognition_en.md)
 								    - [Yml Configuration](./doc/doc_en/config_en.md)
 								    - [Tricks](./doc/doc_en/tricks_en.md)
-												Update README.md
											
										
										
											2020-07-17 15:53:14 +08:00
+								- Deployment
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								    - [Python Inference](./doc/doc_en/inference_en.md)
 								    - [C++ Inference](./deploy/cpp_infer/readme_en.md)
 								    - [Serving](./doc/doc_en/serving_en.md)
 								    - [Mobile](./deploy/lite/readme_en.md)
 								    - Model Quantization and Compression (coming soon)
 								    - [Benchmark](./doc/doc_en/benchmark_en.md)
-												Update README.md
											
										
										
											2020-07-17 15:53:14 +08:00
+								- Datasets
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								    - [General OCR Datasets(Chinese/English)](./doc/doc_en/datasets_en.md)
 								    - [HandWritten_OCR_Datasets(Chinese)](./doc/doc_en/handwritten_datasets_en.md)
 								    - [Various OCR Datasets(multilingual)](./doc/doc_en/vertical_and_multilingual_datasets_en.md)
 								    - [Data Annotation Tools](./doc/doc_en/data_annotation_en.md)
 								    - [Data Synthesis Tools](./doc/doc_en/data_synthesis_en.md)
-												Update README.md
											
										
										
											2020-07-12 09:20:21 +08:00
+								- [FAQ](#FAQ)
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								- Visualization
 								    - [Ultra-lightweight Chinese/English OCR Visualization](#UCOCRVIS)
 								    - [General Chinese/English OCR Visualization](#GeOCRVIS)
-												Fix spelling errors
											
										
										
											2020-08-19 20:45:32 +08:00
+								    - [Chinese/English OCR Visualization (Support Space Recognition )](#SpaceOCRVIS)
-												Update README.md
											
										
										
											2020-07-17 16:03:36 +08:00
+								- [Community](#Community)
 								- [References](./doc/doc_en/reference_en.md)
 								- [License](#LICENSE)
 								- [Contribution](#CONTRIBUTION)
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
 								<a name="TEXTDETECTIONALGORITHM"></a>
 								## Text Detection Algorithm
 								PaddleOCR open source text detection algorithms list:
-												update readme url

											
										
										
											2020-05-14 12:04:43 +08:00
+								- [x]  EAST([paper](https://arxiv.org/abs/1704.03155))
-												fix url

											
										
										
											2020-05-14 12:07:22 +08:00
+								- [x]  DB([paper](https://arxiv.org/abs/1911.08947))
-												modify docs for updates

											
										
										
											2020-08-16 19:44:07 +08:00
+								- [x]  SAST([paper](https://arxiv.org/abs/1908.05498))(Baidu Self-Research)
-												update doc

											
										
										
											2020-05-14 00:01:00 +08:00
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								On the ICDAR2015 dataset, the text detection result is as follows:
-												update doc

											
										
										
											2020-05-14 00:01:00 +08:00
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								|Model|Backbone|precision|recall|Hmean|Download link|
-												fix predict_det not found unclip_ratio

											
										
										
											2020-05-25 18:14:13 +08:00
+								|-|-|-|-|-|-|
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								|EAST|ResNet50_vd|88.18%|85.51%|86.82%|[Download link](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)|
 								|EAST|MobileNetV3|81.67%|79.83%|80.74%|[Download link](https://paddleocr.bj.bcebos.com/det_mv3_east.tar)|
 								|DB|ResNet50_vd|83.79%|80.65%|82.19%|[Download link](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)|
 								|DB|MobileNetV3|75.92%|73.18%|74.53%|[Download link](https://paddleocr.bj.bcebos.com/det_mv3_db.tar)|
-												modify docs for updates

											
										
										
											2020-08-16 19:44:07 +08:00
+								|SAST|ResNet50_vd|92.18%|82.96%|87.33%|[Download link](https://paddleocr.bj.bcebos.com/SAST/sast_r50_vd_icdar2015.tar)|
-												solve det eval bug and optimize doc

											
										
										
											2020-05-25 16:29:20 +08:00
-												add total-text metrics

											
										
										
											2020-08-16 20:07:01 +08:00
+								On Total-Text dataset, the text detection result is as follows:
 								|Model|Backbone|precision|recall|Hmean|Download link|
 								|-|-|-|-|-|-|
 								|SAST|ResNet50_vd|88.74%|79.80%|84.03%|[Download link](https://paddleocr.bj.bcebos.com/SAST/sast_r50_vd_total_text.tar)|
-												fix bug in predict_det for sast & update docs

											
										
										
											2020-08-18 16:05:12 +08:00
+								**Note：** Additional data, like icdar2013, icdar2017, COCO-Text, ArT, was added to the model training of SAST. Download English public dataset in organized format used by PaddleOCR from [Baidu Drive](https://pan.baidu.com/s/12cPnZcVuV1zn5DOd4mqjVw) (download code: 2bpi).
 								For use of [LSVT](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/datasets_en.md#1-icdar2019-lsvt) street view dataset with a total of 3w training data，the related configuration and pre-trained models for text detection task are as follows:
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								|Model|Backbone|Configuration file|Pre-trained model|
-												update doc

											
										
										
											2020-06-08 16:51:05 +08:00
+								|-|-|-|-|
-												Update README.md
											
										
										
											2020-07-17 16:52:14 +08:00
+								|ultra-lightweight OCR model|MobileNetV3|det_mv3_db.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|
 								|General OCR model|ResNet50_vd|det_r50_vd_db.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|
-												update doc

											
										
										
											2020-06-08 16:51:05 +08:00
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								* Note: For the training and evaluation of the above DB model, post-processing parameters box_thresh=0.6 and unclip_ratio=1.5 need to be set. If using different datasets and different models for training, these two parameters can be adjusted for better result.
-												update doc

											
										
										
											2020-05-14 00:01:00 +08:00
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
+								For the training guide and use of PaddleOCR text detection algorithms, please refer to the document [Text detection model training/evaluation/prediction](./doc/doc_en/detection_en.md)
-												update doc

											
										
										
											2020-05-14 00:01:00 +08:00
-												change en doc

											
										
										
											2020-07-17 15:49:30 +08:00
<a name="TEXTRECOGNITIONALGORITHM"></a>
## Text Recognition Algorithm

PaddleOCR open-source text recognition algorithms list:
- [x]  CRNN([paper](https://arxiv.org/abs/1507.05717))
- [x]  Rosetta([paper](https://arxiv.org/abs/1910.05085))
- [x]  STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))
- [x]  RARE([paper](https://arxiv.org/abs/1603.03915v1))
- [x]  SRN([paper](https://arxiv.org/abs/2003.12294))(Baidu Self-Research)

											
										
										
Following [DTRB](https://arxiv.org/abs/1904.01906), the text recognition models above are trained on MJSynth and SynthText and evaluated on IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE. The results are as follows:

											
										
										
|Model|Backbone|Avg Accuracy|Module combination|Download link|
|-|-|-|-|-|
|Rosetta|Resnet34_vd|80.24%|rec_r34_vd_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_none_ctc.tar)|
|Rosetta|MobileNetV3|78.16%|rec_mv3_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_none_none_ctc.tar)|
|CRNN|Resnet34_vd|82.20%|rec_r34_vd_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_bilstm_ctc.tar)|
|CRNN|MobileNetV3|79.37%|rec_mv3_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar)|
|STAR-Net|Resnet34_vd|83.93%|rec_r34_vd_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_ctc.tar)|
|STAR-Net|MobileNetV3|81.56%|rec_mv3_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_ctc.tar)|
|RARE|Resnet34_vd|84.90%|rec_r34_vd_tps_bilstm_attn|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_attn.tar)|
|RARE|MobileNetV3|83.32%|rec_mv3_tps_bilstm_attn|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_attn.tar)|
|SRN|Resnet50_vd_fpn|88.33%|rec_r50fpn_vd_none_srn|[Download link](https://paddleocr.bj.bcebos.com/SRN/rec_r50fpn_vd_none_srn.tar)|

											
										
										
**Note:** The SRN model uses a data expansion method to expand the two training sets mentioned above. The expanded data can be downloaded from [Baidu Drive](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA) (download code: y3ry).

The average accuracy of two-stage training in the original paper is 89.74%, while one-stage training in PaddleOCR reaches 88.33%. Both pre-trained weights can be downloaded [here](https://paddleocr.bj.bcebos.com/SRN/rec_r50fpn_vd_none_srn.tar).
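The module combination names in the table above appear to follow the four-stage layout used by DTRB: backbone, transformation, sequence modeling, and prediction. The sketch below splits a name into those stages. The parser and its field names are hypothetical helpers for reading the table, not part of PaddleOCR, and the SRN entry does not follow this four-part pattern.

```python
from typing import NamedTuple


class RecModules(NamedTuple):
    backbone: str        # e.g. "mv3" or "r34_vd"
    transformation: str  # "tps" or "none"
    sequence: str        # "bilstm" or "none"
    prediction: str      # "ctc" or "attn"


def parse_module_combination(name: str) -> RecModules:
    """Split a name such as 'rec_mv3_tps_bilstm_attn' into its four stages."""
    prefix = "rec_"
    if not name.startswith(prefix):
        raise ValueError(f"unexpected module combination name: {name!r}")
    # The backbone may itself contain underscores (r34_vd), so split from the right.
    parts = name[len(prefix):].rsplit("_", 3)
    if len(parts) != 4:
        raise ValueError(f"cannot split {name!r} into four stages")
    return RecModules(*parts)


for name in ("rec_r34_vd_none_none_ctc", "rec_mv3_tps_bilstm_attn"):
    print(name, "->", parse_module_combination(name))
```

Read this way, Rosetta is a backbone with a CTC head, CRNN adds a BiLSTM, STAR-Net adds a TPS transformation, and RARE swaps the CTC head for attention.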

											
										
										
We use the [LSVT](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/datasets_en.md#1-icdar2019-lsvt) dataset and crop 300,000 training images from the original photos using the position ground truth, with some calibration where needed. In addition, 5 million synthetic images are generated from the LSVT corpus to train the model. The related configuration and pre-trained models are as follows:

											
										
										
|Model|Backbone|Configuration file|Pre-trained model|Enhanced model|
|-|-|-|-|-|
|ultra-lightweight OCR model|MobileNetV3|rec_chinese_lite_train.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) & [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar)|
|General OCR model|Resnet34_vd|rec_chinese_common_train.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) & [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar)|
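The cropping step mentioned above (cutting text regions out of the original photos using position ground truth) can be sketched as follows. This is a simplified illustration that takes the axis-aligned bounding box of an annotated polygon. The annotation layout (`points`, `transcription`) and the function names are assumptions made for the example, and the image is a plain nested list so the snippet has no dependencies.

```python
def bounding_box(points):
    """Axis-aligned bounding box (x0, y0, x1, y1) of a polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def crop_region(image, points):
    """Return the rows of `image` covered by the bounding box of `points`.

    `image` is a list of rows, each row a list of pixel values.
    The box is clamped to the image borders.
    """
    x0, y0, x1, y1 = bounding_box(points)
    height, width = len(image), len(image[0])
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width - 1, x1), min(height - 1, y1)
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]


# Toy 4 x 6 "image" whose pixel value encodes its position (10 * row + column).
image = [[10 * r + c for c in range(6)] for r in range(4)]
# Hypothetical annotation for one text instance.
annotation = {"points": [(1, 1), (3, 1), (3, 2), (1, 2)], "transcription": "demo"}

patch = crop_region(image, annotation["points"])
print(patch)  # [[11, 12, 13], [21, 22, 23]]
```

Each cropped patch, paired with its transcription, would then form one recognition training sample.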

											
										
										
For a guide to training and using the PaddleOCR text recognition algorithms, see [Text recognition model training/evaluation/prediction](./doc/doc_en/recognition_en.md).

											
										
										
<a name="ENDENDOCRALGORITHM"></a>
## END-TO-END OCR Algorithm

- [ ]  [End2End-PSL](https://arxiv.org/abs/1909.07808)(Baidu Self-Research, coming soon)

											
										
										
## Visualization

<a name="UCOCRVIS"></a>
### 1. Ultra-lightweight Chinese/English OCR Visualization [more](./doc/doc_en/visualization_en.md)

<div align="center">
    <img src="doc/imgs_results/1.jpg" width="800">
</div>

<a name="GeOCRVIS"></a>
### 2. General Chinese/English OCR Visualization [more](./doc/doc_en/visualization_en.md)

<div align="center">
    <img src="doc/imgs_results/chinese_db_crnn_server/11.jpg" width="800">
</div>

<a name="SpaceOCRVIS"></a>
### 3. Chinese/English OCR Visualization (Space_support) [more](./doc/doc_en/visualization_en.md)

<div align="center">
    <img src="doc/imgs_results/chinese_db_crnn_server/en_paper.jpg" width="800">
</div>

											
										
										
<a name="FAQ"></a>
## FAQ

1. Error when using attention-based recognition model: KeyError: 'predict'

    Inference for recognition models based on attention loss is still being debugged. For Chinese text recognition, we recommend choosing a recognition model based on CTC loss first. In practice, recognition models based on attention loss have also been found to be less effective than those based on CTC loss.

2. About inference speed

    When an image contains a lot of text, the prediction time increases. You can use `--rec_batch_num` to set a smaller prediction batch size. The default value is 30, which can be changed to 10 or another value.

3. Service deployment and mobile deployment

    Service deployment based on Serving and mobile deployment based on Paddle Lite are expected to be released successively in mid-to-late June. Stay tuned for more updates.

4. Release time of self-developed algorithms

    Baidu self-developed algorithms such as SAST, SRN, and End2End-PSL will be released in June or July. Please be patient.
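As a rough illustration of what the batch size in the inference speed question controls, the sketch below splits a list of cropped text regions into batches. It is a generic batching example under assumed names (`iter_batches`, `crops`), not PaddleOCR's actual implementation.

```python
def iter_batches(items, batch_size=30):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# 65 hypothetical text crops detected in one image.
crops = [f"crop_{i}" for i in range(65)]

sizes_default = [len(batch) for batch in iter_batches(crops, batch_size=30)]
sizes_small = [len(batch) for batch in iter_batches(crops, batch_size=10)]
print(sizes_default)  # [30, 30, 5]
print(sizes_small)    # [10, 10, 10, 10, 10, 10, 5]
```

A smaller batch means more forward passes, each over fewer crops.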
											
										
										
[more](./doc/doc_en/FAQ_en.md)

											
										
										
<a name="Community"></a>
## Community

Scan the QR code below with WeChat and complete the questionnaire to join the official technical exchange group.

<div align="center">
<img src="./doc/joinus.jpg"  width = "200" height = "200" />
</div>
											
										
										
<a name="LICENSE"></a>
## License

This project is released under the <a href="https://github.com/PaddlePaddle/PaddleOCR/blob/master/LICENSE">Apache 2.0 license</a>.
											
										
										
<a name="CONTRIBUTION"></a>
## Contribution

We welcome all contributions to PaddleOCR and greatly appreciate your feedback.

- Many thanks to [Khanh Tran](https://github.com/xxxpsyduck) and [Karl Horky](https://github.com/karlhorky) for contributing and revising the English documentation.
- Many thanks to [zhangxin](https://github.com/ZhangXinNan) for contributing the new visualization function, adding .gitignore, and removing the need to set PYTHONPATH manually.
- Many thanks to [lyl120117](https://github.com/lyl120117) for contributing the code for printing the network structure.
- Thanks to [xiangyubo](https://github.com/xiangyubo) for contributing the handwritten Chinese OCR datasets.
- Thanks to [authorfu](https://github.com/authorfu) for contributing the Android demo and [xiadeye](https://github.com/xiadeye) for contributing the iOS demo.
- Thanks to [BeyondYourself](https://github.com/BeyondYourself) for contributing many great suggestions and simplifying part of the code style.
- Thanks to [tangmq](https://gitee.com/tangmq) for contributing Dockerized deployment services to PaddleOCR and supporting the rapid release of callable RESTful API services.