Merge branch 'develop' into fix_typo_and_polish_infer

This commit is contained in:
xiaoting 2020-09-21 21:16:08 +08:00 committed by GitHub
commit e8ba14955c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
16 changed files with 91 additions and 55 deletions

View File

@ -69,6 +69,7 @@ For more model downloads (including multiple languages), please refer to [PP-OCR
- Model training/evaluation
- [Text Detection](./doc/doc_en/detection_en.md)
- [Text Recognition](./doc/doc_en/recognition_en.md)
- [Direction Classification](./doc/doc_en/angle_class_en.md)
- [Yml Configuration](./doc/doc_en/config_en.md)
- Inference and Deployment
- [Quick inference based on pip](./doc/doc_en/whl_en.md)

View File

@ -67,6 +67,7 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库助力
- 模型训练/评估
- [文本检测](./doc/doc_ch/detection.md)
- [文本识别](./doc/doc_ch/recognition.md)
- [方向分类器](./doc/doc_ch/angle_class.md)
- [yml参数配置文件介绍](./doc/doc_ch/config.md)
- 预测部署
- [基于pip安装whl包快速推理](./doc/doc_ch/whl.md)
@ -75,7 +76,7 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库助力
- [服务化部署](./deploy/hubserving/readme.md)
- [端侧部署](./deploy/lite/readme.md)
- [模型量化](./deploy/slim/quantization/README.md)
- [模型裁剪](./deploy/slim/prune/README_ch.md)
- [模型裁剪](./deploy/slim/prune/README.md)
- [Benchmark](./doc/doc_ch/benchmark.md)
- 数据集
- [通用中英文OCR数据集](./doc/doc_ch/datasets.md)

View File

@ -43,6 +43,10 @@ Optimizer:
base_lr: 0.001
beta1: 0.9
beta2: 0.999
decay:
function: cosine_decay_warmup
step_each_epoch: 32
total_epoch: 1200
PostProcess:
function: ppocr.postprocess.db_postprocess,DBPostProcess

View File

@ -3,7 +3,8 @@
复杂的模型有利于提高模型的性能,但也导致模型中存在一定冗余,模型裁剪通过移出网络模型中的子模型来减少这种冗余,达到减少模型计算复杂度,提高模型推理性能的目的。
本教程将介绍如何使用PaddleSlim量化PaddleOCR的模型。
本教程将介绍如何使用飞桨模型压缩库PaddleSlim做PaddleOCR模型的压缩。
PaddleSlim项目链接https://github.com/PaddlePaddle/PaddleSlim集成了模型剪枝、量化包括量化训练和离线量化、蒸馏和神经网络搜索等多种业界常用且领先的模型压缩功能如果您感兴趣可以关注并了解。
在开始本教程之前,建议先了解
1. [PaddleOCR模型的训练方法](../../../doc/doc_ch/quickstart.md)
@ -33,8 +34,20 @@ python setup.py install
### 3. 敏感度分析训练
加载预训练模型后,通过对现有模型的每个网络层进行敏感度分析,了解各网络层冗余度,从而决定每个网络层的裁剪比例。
加载预训练模型后,通过对现有模型的每个网络层进行敏感度分析,得到敏感度文件sensitivities_0.data可以通过PaddleSlim提供的[接口](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221)加载文件,获得各网络层在不同裁剪比例下的精度损失。从而了解各网络层冗余度,决定每个网络层的裁剪比例。
敏感度分析的具体细节见:[敏感度分析](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
敏感度文件内容格式:
sensitivities_0.data(Dict){
'layer_weight_name_0': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
'layer_weight_name_1': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
}
例子:
{
'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594}
'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405}
}
加载敏感度文件后会返回一个字典字典中的keys为网络模型参数模型的名字values为一个字典里面保存了相应网络层的裁剪敏感度信息。例如在例子中conv10_expand_weights所对应的网络层在裁掉10%的卷积核后模型性能相较原模型会下降0.65%,详细信息可见[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86)
进入PaddleOCR根目录通过以下命令对模型进行敏感度分析训练
```bash
@ -42,7 +55,7 @@ python deploy/slim/prune/sensitivity_anal.py -c configs/det/det_mv3_db.yml -o Gl
```
### 4. 模型裁剪训练
裁剪时通过之前的敏感度分析文件决定每个网络层的裁剪比例。在具体实现时为了尽可能多的保留从图像中提取的低阶特征我们跳过了backbone中靠近输入的4个卷积层。同样为了减少由于裁剪导致的模型性能损失我们通过之前敏感度分析所获得的敏感度表挑选出了一些冗余较少对裁剪较为敏感的[网络层](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/prune/pruning_and_finetune.py#L41)并在之后的裁剪过程中选择避开这些网络层。裁剪过后finetune的过程沿用OCR检测模型原始的训练策略。
裁剪时通过之前的敏感度分析文件决定每个网络层的裁剪比例。在具体实现时为了尽可能多的保留从图像中提取的低阶特征我们跳过了backbone中靠近输入的4个卷积层。同样为了减少由于裁剪导致的模型性能损失我们通过之前敏感度分析所获得的敏感度表人工挑选出了一些冗余较少,对裁剪较为敏感的[网络层](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/prune/pruning_and_finetune.py#L41)(指在较低的裁剪比例下就导致很高性能损失的网络层)并在之后的裁剪过程中选择避开这些网络层。裁剪过后finetune的过程沿用OCR检测模型原始的训练策略。
```bash
python deploy/slim/prune/pruning_and_finetune.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1

View File

@ -115,6 +115,7 @@ Compress results
Generally, a more complex model would achive better performance in the task, but it also leads to some redundancy in the model. Model Pruning is a technique that reduces this redundancy by removing the sub-models in the neural network model, so as to reduce model calculation complexity and improve model inference performance.
This example uses PaddleSlim provided[APIs of Pruning](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/) to compress the OCR model.
PaddleSlim (GitHub: https://github.com/PaddlePaddle/PaddleSlim), an open source library which integrates model pruning, quantization (including quantization training and offline quantization), distillation, neural network architecture search, and many other commonly used and leading model compression technique in the industry.
It is recommended that you could understand following pages before reading this example,
@ -146,7 +147,20 @@ python setup.py install
## Pruning sensitivity analysis
After the pre-training model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, thereby determining the pruning ratio of each network layer. For specific details of sensitivity analysis, see[Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
After the pre-training model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, and save a sensitivity file which named: sensitivities_0.data. After that, user could load the sensitivity file via the [methods provided by PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221) and determining the pruning ratio of each network layer automatically. For specific details of sensitivity analysis, see[Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
The data format of sensitivity file
sensitivities_0.data(Dict){
'layer_weight_name_0': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
'layer_weight_name_1': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
}
example
{
'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594}
'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405}
}
The function would return a dict after loading the sensitivity file. The keys of the dict are name of parameters in each layer. And the value of key is the information about pruning sensitivity of correspoding layer. In example, pruning 10% filter of the layer corresponding to conv10_expand_weights would lead to 0.65% degradation of model performance. The details could be seen at: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86)
Enter the PaddleOCR root directoryperform sensitivity analysis on the model with the following command

View File

@ -3,7 +3,8 @@
复杂的模型有利于提高模型的性能,但也导致模型中存在一定冗余,模型量化将全精度缩减到定点数减少这种冗余,达到减少模型计算复杂度,提高模型推理性能的目的。
模型量化可以在基本不损失模型的精度的情况下将FP32精度的模型参数转换为Int8精度减小模型参数大小并加速计算使用量化后的模型在移动端等部署时更具备速度优势。
本教程将介绍如何使用PaddleSlim量化PaddleOCR的模型。
本教程将介绍如何使用飞桨模型压缩库PaddleSlim做PaddleOCR模型的压缩。
PaddleSlim项目链接https://github.com/PaddlePaddle/PaddleSlim集成了模型剪枝、量化包括量化训练和离线量化、蒸馏和神经网络搜索等多种业界常用且领先的模型压缩功能如果您感兴趣可以关注并了解。
在开始本教程之前,建议先了解[PaddleOCR模型的训练方法](../../../doc/doc_ch/quickstart.md)以及[PaddleSlim](https://paddleslim.readthedocs.io/zh_CN/latest/index.html)

View File

@ -116,6 +116,7 @@ Compress results
Generally, a more complex model would achive better performance in the task, but it also leads to some redundancy in the model. Quantization is a technique that reduces this redundancyby reducing the full precision data to a fixed number, so as to reduce model calculation complexity and improve model inference performance.
This example uses PaddleSlim provided [APIs of Quantization](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) to compress the OCR model.
PaddleSlim (GitHub: https://github.com/PaddlePaddle/PaddleSlim), an open source library which integrates model pruning, quantization (including quantization training and offline quantization), distillation, neural network architecture search, and many other commonly used and leading model compression technique in the industry.
It is recommended that you could understand following pages before reading this example,

View File

@ -10,7 +10,7 @@
## 配置文件 Global 参数介绍
`rec_chinese_lite_train.yml` 为例
`rec_chinese_lite_train_v1.1.yml ` 为例
| 字段 | 用途 | 默认值 | 备注 |
@ -32,6 +32,7 @@
| loss_type | 设置 loss 类型 | ctc | 支持两种loss ctc / attention |
| distort | 设置是否使用数据增强 | false | 设置为true时将在训练时随机进行扰动支持的扰动操作可阅读[img_tools.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/rec/img_tools.py) |
| use_space_char | 设置是否识别空格 | false | 仅在 character_type=ch 时支持空格 |
| label_list | 设置方向分类器支持的角度 | ['0','180'] | 仅在方向分类器中生效 |
| average_window | ModelAverage优化器中的窗口长度计算比例 | 0.15 | 目前仅应用与SRN |
| max_average_window | 平均值计算窗口长度的最大值 | 15625 | 推荐设置为一轮训练中mini-batchs的数目|
| min_average_window | 平均值计算窗口长度的最小值 | 10000 | \ |

View File

@ -44,13 +44,15 @@ json.dumps编码前的图像标注信息是包含多个字典的list字典中
## 快速启动训练
首先下载模型backbone的pretrain modelPaddleOCR的检测模型目前支持两种backbone分别是MobileNetV3、ResNet50_vd
首先下载模型backbone的pretrain modelPaddleOCR的检测模型目前支持两种backbone分别是MobileNetV3、ResNet_vd系列
您可以根据需求使用[PaddleClas](https://github.com/PaddlePaddle/PaddleClas/tree/master/ppcls/modeling/architectures)中的模型更换backbone。
```shell
cd PaddleOCR/
# 下载MobileNetV3的预训练模型
wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV3_large_x0_5_pretrained.tar
# 下载ResNet50的预训练模型
# 或下载ResNet18_vd的预训练模型
wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/ResNet18_vd_pretrained.tar
# 或下载ResNet50_vd的预训练模型
wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_ssld_pretrained.tar
# 解压预训练模型文件以MobileNetV3为例
@ -72,24 +74,24 @@ tar -xf ./pretrain_models/MobileNetV3_large_x0_5_pretrained.tar ./pretrain_model
```shell
# 训练 mv3_db 模型,并将训练日志保存为 tain_det.log
python3 tools/train.py -c configs/det/det_mv3_db.yml \
python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml \
-o Global.pretrain_weights=./pretrain_models/MobileNetV3_large_x0_5_pretrained/ \
2>&1 | tee train_det.log
```
上述指令中,通过-c 选择训练使用configs/det/det_db_mv3.yml配置文件。
上述指令中,通过-c 选择训练使用configs/det/det_db_mv3_v1.1.yml配置文件。
有关配置文件的详细解释,请参考[链接](./config.md)。
您也可以通过-o参数在不需要修改yml文件的情况下改变训练的参数比如调整训练的学习率为0.0001
```shell
python3 tools/train.py -c configs/det/det_mv3_db.yml -o Optimizer.base_lr=0.0001
python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml -o Optimizer.base_lr=0.0001
```
#### 断点训练
如果训练程序中断如果希望加载训练中断的模型从而恢复训练可以通过指定Global.checkpoints指定要加载的模型路径
```shell
python3 tools/train.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=./your/trained/model
python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints=./your/trained/model
```
**注意**`Global.checkpoints`的优先级高于`Global.pretrain_weights`的优先级,即同时指定两个参数时,优先加载`Global.checkpoints`指定的模型,如果`Global.checkpoints`指定的模型路径有误,会加载`Global.pretrain_weights`指定的模型。
@ -98,17 +100,17 @@ python3 tools/train.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=./you
PaddleOCR计算三个OCR检测相关的指标分别是Precision、Recall、Hmean。
运行如下代码,根据配置文件`det_db_mv3.yml`中`save_res_path`指定的测试集检测结果文件,计算评估指标。
运行如下代码,根据配置文件`det_db_mv3_v1.1.yml`中`save_res_path`指定的测试集检测结果文件,计算评估指标。
评估时设置后处理参数`box_thresh=0.6``unclip_ratio=1.5`,使用不同数据集、不同模型训练,可调整这两个参数进行优化
```shell
python3 tools/eval.py -c configs/det/det_mv3_db.yml -o Global.checkpoints="{path/to/weights}/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
python3 tools/eval.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints="{path/to/weights}/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
```
训练中模型参数默认保存在`Global.save_model_dir`目录下。在评估指标时,需要设置`Global.checkpoints`指向保存的参数文件。
比如:
```shell
python3 tools/eval.py -c configs/det/det_mv3_db.yml -o Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
python3 tools/eval.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
```
* 注:`box_thresh`、`unclip_ratio`是DB后处理所需要的参数在评估EAST模型时不需要设置
@ -117,16 +119,16 @@ python3 tools/eval.py -c configs/det/det_mv3_db.yml -o Global.checkpoints="./ou
测试单张图像的检测效果
```shell
python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy"
python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy"
```
测试DB模型时调整后处理阈值
```shell
python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
```
测试文件夹下所有图像的检测效果
```shell
python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/" Global.checkpoints="./output/det_db/best_accuracy"
python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/" Global.checkpoints="./output/det_db/best_accuracy"
```

View File

@ -2,7 +2,7 @@
经测试PaddleOCR可在glibc 2.23上运行您也可以测试其他glibc版本或安装glic 2.23
PaddleOCR 工作环境
- PaddlePaddle 1.7+
- PaddlePaddle 1.8+ ,推荐使用 PaddlePaddle 2.0.0.beta
- python3.7
- glibc 2.23
- cuDNN 7.6+ (GPU)
@ -47,19 +47,16 @@ docker images
hub.baidubce.com/paddlepaddle/paddle latest-gpu-cuda9.0-cudnn7-dev f56310dcc829
```
**2. 安装PaddlePaddle Fluid v1.7**
**2. 安装PaddlePaddle Fluid v2.0**
```
pip3 install --upgrade pip
如果您的机器安装的是CUDA9请运行以下命令安装
python3 -m pip install paddlepaddle-gpu==1.7.2.post97 -i https://pypi.tuna.tsinghua.edu.cn/simple
如果您的机器安装的是CUDA10请运行以下命令安装
python3 -m pip install paddlepaddle-gpu==1.7.2.post107 -i https://pypi.tuna.tsinghua.edu.cn/simple
如果您的机器安装的是CUDA9或CUDA10请运行以下命令安装
python3 -m pip install paddlepaddle-gpu==2.0.0b0 -i https://mirror.baidu.com/pypi/simple
如果您的机器是CPU请运行以下命令安装
python3 -m pip install paddlepaddle==1.7.2 -i https://pypi.tuna.tsinghua.edu.cn/simple
python3 -m pip install paddlepaddle==2.0.0b0 -i https://mirror.baidu.com/pypi/simple
更多的版本需求,请参照[安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
```

View File

@ -259,6 +259,7 @@ PaddleOCR也提供了多语言的 `configs/rec/multi_languages` 路径下的
| rec_japan_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | 日语 |
| rec_korean_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | 韩语 |
多语言模型训练方式与中文模型一致训练数据集均为100w的合成数据少量的字体可以在 [百度网盘](https://pan.baidu.com/s/1bS_u207Rm7YbY33wOECKDA) 上下载提取码frgi。
如您希望在现有模型效果的基础上调优,请参考下列说明修改配置文件:

View File

@ -11,8 +11,8 @@ pip install paddleocr
本地构建并安装
```bash
python setup.py bdist_wheel
pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x是paddleocr的版本号
python3 setup.py bdist_wheel
pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x是paddleocr的版本号
```
### 1. 代码使用
@ -20,7 +20,7 @@ pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x是paddleocr的版本
```python
from paddleocr import PaddleOCR, draw_ocr
# Paddleocr目前支持中英文、英文、法语、德语、韩语、日语可以通过修改lang参数进行切换
# 参数依次为`zh`, `en`, `french`, `german`, `korean`, `japan`
# 参数依次为`ch`, `en`, `french`, `german`, `korean`, `japan`
ocr = PaddleOCR(use_angle_cls=True, lang="ch") # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs/11.jpg'
result = ocr.ocr(img_path, cls=True)
@ -280,7 +280,7 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_
| rec_algorithm | 使用的识别算法类型 | CRNN |
| rec_model_dir | 识别模型所在文件夹。传参方式有两种1. None: 自动下载内置模型到 `~/.paddleocr/rec`2.自己转换好的inference模型路径模型路径下必须包含model和params文件 | None |
| rec_image_shape | 识别算法的输入图片尺寸 | "3,32,320" |
| rec_char_type | 识别算法的字符类型,中文(ch)或英文(en) | ch |
| rec_char_type | 识别算法的字符类型,中英文(ch)、英文(en)、法语(french)、德语(german)、韩语(korean)、日语(japan) | ch |
| rec_batch_num | 进行识别时,同时前向的图片数 | 30 |
| max_text_length | 识别算法能识别的最大文字长度 | 25 |
| rec_char_dict_path | 识别模型字典路径当rec_model_dir使用方式2传参时需要修改为自己的字典路径 | ./ppocr/utils/ppocr_keys_v1.txt |

View File

@ -10,7 +10,7 @@ The following list can be viewed via `--help`
## INTRODUCTION TO GLOBAL PARAMETERS OF CONFIGURATION FILE
Take `rec_chinese_lite_train.yml` as an example
Take `rec_chinese_lite_train_v1.1.yml` as an example
| Parameter | Use | Default | Note |
@ -32,6 +32,7 @@ Take `rec_chinese_lite_train.yml` as an example
| loss_type | Set loss type | ctc | Supports two types of loss: ctc / attention |
| distort | Set use distort | false | Support distort type ,read [img_tools.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/rec/img_tools.py) |
| use_space_char | Wether to recognize space | false | Only support in character_type=ch mode |
label_list | Set the angle supported by the direction classifier | ['0','180'] | Only valid in the direction classifier |
| reader_yml | Set the reader configuration file | ./configs/rec/rec_icdar15_reader.yml | \ |
| pretrain_weights | Load pre-trained model path | ./pretrain_models/CRNN/best_accuracy | \ |
| checkpoints | Load saved model path | None | Used to load saved parameters to continue training after interruption |

View File

@ -38,12 +38,14 @@ If you want to train PaddleOCR on other datasets, please build the annotation fi
## TRAINING
First download the pretrained model. The detection model of PaddleOCR currently supports two backbones, namely MobileNetV3 and ResNet50_vd. You can use the model in [PaddleClas](https://github.com/PaddlePaddle/PaddleClas/tree/master/ppcls/modeling/architectures) to replace backbone according to your needs.
First download the pretrained model. The detection model of PaddleOCR currently supports 3 backbones, namely MobileNetV3, ResNet18_vd and ResNet50_vd. You can use the model in [PaddleClas](https://github.com/PaddlePaddle/PaddleClas/tree/master/ppcls/modeling/architectures) to replace backbone according to your needs.
```shell
cd PaddleOCR/
# Download the pre-trained model of MobileNetV3
wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV3_large_x0_5_pretrained.tar
# Download the pre-trained model of ResNet50
# or, download the pre-trained model of ResNet18_vd
wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/ResNet18_vd_pretrained.tar
# or, download the pre-trained model of ResNet50_vd
wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_ssld_pretrained.tar
# decompressing the pre-training model file, take MobileNetV3 as an example
@ -62,7 +64,7 @@ tar -xf ./pretrain_models/MobileNetV3_large_x0_5_pretrained.tar ./pretrain_model
#### START TRAINING
*If CPU version installed, please set the parameter `use_gpu` to `false` in the configuration.*
```shell
python3 tools/train.py -c configs/det/det_mv3_db.yml 2>&1 | tee train_det.log
python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml 2>&1 | tee train_det.log
```
In the above instruction, use `-c` to select the training to use the `configs/det/det_db_mv3.yml` configuration file.
@ -70,7 +72,7 @@ For a detailed explanation of the configuration file, please refer to [config](.
You can also use `-o` to change the training parameters without modifying the yml file. For example, adjust the training learning rate to 0.0001
```shell
python3 tools/train.py -c configs/det/det_mv3_db.yml -o Optimizer.base_lr=0.0001
python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml -o Optimizer.base_lr=0.0001
```
#### load trained model and continue training
@ -78,7 +80,7 @@ If you expect to load trained model and continue the training again, you can spe
For example:
```shell
python3 tools/train.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=./your/trained/model
python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints=./your/trained/model
```
**Note**: The priority of `Global.checkpoints` is higher than that of `Global.pretrain_weights`, that is, when two parameters are specified at the same time, the model specified by `Global.checkpoints` will be loaded first. If the model path specified by `Global.checkpoints` is wrong, the one specified by `Global.pretrain_weights` will be loaded.
@ -88,18 +90,18 @@ python3 tools/train.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=./you
PaddleOCR calculates three indicators for evaluating performance of OCR detection task: Precision, Recall, and Hmean.
Run the following code to calculate the evaluation indicators. The result will be saved in the test result file specified by `save_res_path` in the configuration file `det_db_mv3.yml`
Run the following code to calculate the evaluation indicators. The result will be saved in the test result file specified by `save_res_path` in the configuration file `det_db_mv3_v1.1.yml`
When evaluating, set post-processing parameters `box_thresh=0.6`, `unclip_ratio=1.5`. If you use different datasets, different models for training, these two parameters should be adjusted for better result.
```shell
python3 tools/eval.py -c configs/det/det_mv3_db.yml -o Global.checkpoints="{path/to/weights}/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
python3 tools/eval.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints="{path/to/weights}/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
```
The model parameters during training are saved in the `Global.save_model_dir` directory by default. When evaluating indicators, you need to set `Global.checkpoints` to point to the saved parameter file.
Such as:
```shell
python3 tools/eval.py -c configs/det/det_mv3_db.yml -o Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
python3 tools/eval.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
```
* Note: `box_thresh` and `unclip_ratio` are parameters required for DB post-processing, and not need to be set when evaluating the EAST model.
@ -108,16 +110,16 @@ python3 tools/eval.py -c configs/det/det_mv3_db.yml -o Global.checkpoints="./ou
Test the detection result on a single image:
```shell
python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy"
python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy"
```
When testing the DB model, adjust the post-processing threshold:
```shell
python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
```
Test the detection result on all images in the folder:
```shell
python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/" Global.checkpoints="./output/det_db/best_accuracy"
python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/" Global.checkpoints="./output/det_db/best_accuracy"
```

View File

@ -3,7 +3,7 @@
After testing, paddleocr can run on glibc 2.23. You can also test other glibc versions or install glic 2.23 for the best compatibility.
PaddleOCR working environment:
- PaddlePaddle1.7
- PaddlePaddle1.8+, Recommend PaddlePaddle 2.0.0.beta
- python3.7
- glibc 2.23
@ -49,18 +49,15 @@ docker images
hub.baidubce.com/paddlepaddle/paddle latest-gpu-cuda9.0-cudnn7-dev f56310dcc829
```
**2. Install PaddlePaddle Fluid v1.7 (the higher version is not supported yet, the adaptation work is in progress)**
**2. Install PaddlePaddle Fluid v2.0 **
```
pip3 install --upgrade pip
# If you have cuda9 installed on your machine, please run the following command to install
python3 -m pip install paddlepaddle-gpu==1.7.2.post97 -i https://pypi.tuna.tsinghua.edu.cn/simple
# If you have cuda10 installed on your machine, please run the following command to install
python3 -m pip install paddlepaddle-gpu==1.7.2.post107 -i https://pypi.tuna.tsinghua.edu.cn/simple
# If you have cuda9 or cuda10 installed on your machine, please run the following command to install
python3 -m pip install paddlepaddle-gpu==2.0.0b0 -i https://mirror.baidu.com/pypi/simple
# If you only have cpu on your machine, please run the following command to install
python3 -m pip install paddlepaddle==1.7.2 -i https://pypi.tuna.tsinghua.edu.cn/simple
python3 -m pip install paddlepaddle==2.0.0b0 -i https://mirror.baidu.com/pypi/simple
```
For more software version requirements, please refer to the instructions in [Installation Document](https://www.paddlepaddle.org.cn/install/quick) for operation.

View File

@ -9,8 +9,8 @@ pip install paddleocr
build own whl package and install
```bash
python setup.py bdist_wheel
pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr
python3 setup.py bdist_wheel
pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr
```
### 1. Use by code
@ -18,7 +18,7 @@ pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of padd
```python
from paddleocr import PaddleOCR,draw_ocr
# Paddleocr supports Chinese, English, French, German, Korean and Japanese.
# You can set the parameter `lang` as `zh`, `en`, `french`, `german`, `korean`, `japan`
# You can set the parameter `lang` as `ch`, `en`, `french`, `german`, `korean`, `japan`
# to switch the language model in order.
ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
@ -302,7 +302,7 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_
| cls_batch_num | When performing classification, the batchsize of forward images | 30 |
| enable_mkldnn | Whether to enable mkldnn | FALSE |
| use_zero_copy_run | Whether to forward by zero_copy_run | FALSE |
| lang | The support language, now only chinese(ch) and english(en) are supported | ch |
| lang | The support language, now only Chinese(ch)、English(en)、French(french)、German(german)、Korean(korean)、Japanese(japan) are supported | ch |
| det | Enable detction when `ppocr.ocr` func exec | TRUE |
| rec | Enable recognition when `ppocr.ocr` func exec | TRUE |
| cls | Enable classification when `ppocr.ocr` func exec | FALSE |