diff --git a/README.md b/README.md index 9eb21758..1165b10d 100644 --- a/README.md +++ b/README.md @@ -69,6 +69,7 @@ For more model downloads (including multiple languages), please refer to [PP-OCR - Model training/evaluation - [Text Detection](./doc/doc_en/detection_en.md) - [Text Recognition](./doc/doc_en/recognition_en.md) + - [Direction Classification](./doc/doc_en/angle_class_en.md) - [Yml Configuration](./doc/doc_en/config_en.md) - Inference and Deployment - [Quick inference based on pip](./doc/doc_en/whl_en.md) diff --git a/README_ch.md b/README_ch.md index 0106faab..79ac2a1b 100644 --- a/README_ch.md +++ b/README_ch.md @@ -67,6 +67,7 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力 - 模型训练/评估 - [文本检测](./doc/doc_ch/detection.md) - [文本识别](./doc/doc_ch/recognition.md) + - [方向分类器](./doc/doc_ch/angle_class.md) - [yml参数配置文件介绍](./doc/doc_ch/config.md) - 预测部署 - [基于pip安装whl包快速推理](./doc/doc_ch/whl.md) @@ -75,7 +76,7 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力 - [服务化部署](./deploy/hubserving/readme.md) - [端侧部署](./deploy/lite/readme.md) - [模型量化](./deploy/slim/quantization/README.md) - - [模型裁剪](./deploy/slim/prune/README_ch.md) + - [模型裁剪](./deploy/slim/prune/README.md) - [Benchmark](./doc/doc_ch/benchmark.md) - 数据集 - [通用中英文OCR数据集](./doc/doc_ch/datasets.md) diff --git a/configs/det/det_r18_vd_db.yml b/configs/det/det_r18_vd_db_v1.1.yml similarity index 78% rename from configs/det/det_r18_vd_db.yml rename to configs/det/det_r18_vd_db_v1.1.yml index ab313720..aa6dc0ee 100755 --- a/configs/det/det_r18_vd_db.yml +++ b/configs/det/det_r18_vd_db_v1.1.yml @@ -43,6 +43,10 @@ Optimizer: base_lr: 0.001 beta1: 0.9 beta2: 0.999 + decay: + function: cosine_decay_warmup + step_each_epoch: 32 + total_epoch: 1200 PostProcess: function: ppocr.postprocess.db_postprocess,DBPostProcess diff --git a/deploy/slim/prune/README.md b/deploy/slim/prune/README.md index 8ec5492c..ab731215 100644 --- a/deploy/slim/prune/README.md +++ b/deploy/slim/prune/README.md @@ -3,7 +3,8 @@ 复杂的模型有利于提高模型的性能,但也导致模型中存在一定冗余,模型裁剪通过移出网络模型中的子模型来减少这种冗余,达到减少模型计算复杂度,提高模型推理性能的目的。 -本教程将介绍如何使用PaddleSlim量化PaddleOCR的模型。 +本教程将介绍如何使用飞桨模型压缩库PaddleSlim做PaddleOCR模型的压缩。 +PaddleSlim(项目链接:https://github.com/PaddlePaddle/PaddleSlim)集成了模型剪枝、量化(包括量化训练和离线量化)、蒸馏和神经网络搜索等多种业界常用且领先的模型压缩功能,如果您感兴趣,可以关注并了解。 在开始本教程之前,建议先了解 1. [PaddleOCR模型的训练方法](../../../doc/doc_ch/quickstart.md) @@ -33,8 +34,20 @@ python setup.py install ### 3. 
敏感度分析训练
-加载预训练模型后,通过对现有模型的每个网络层进行敏感度分析,了解各网络层冗余度,从而决定每个网络层的裁剪比例。
+加载预训练模型后,通过对现有模型的每个网络层进行敏感度分析,得到敏感度文件:sensitivities_0.data,可以通过PaddleSlim提供的[接口](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221)加载文件,获得各网络层在不同裁剪比例下的精度损失。从而了解各网络层冗余度,决定每个网络层的裁剪比例。
敏感度分析的具体细节见:[敏感度分析](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
+敏感度文件内容格式:
+    sensitivities_0.data(Dict){
+            'layer_weight_name_0': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
+            'layer_weight_name_1': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
+    }
+
+    例子:
+    {
+        'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594}
+        'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405}
+    }
+加载敏感度文件后会返回一个字典,字典中的keys为网络模型参数的名字,values为一个字典,里面保存了相应网络层的裁剪敏感度信息。例如在例子中,conv10_expand_weights所对应的网络层在裁掉10%的卷积核后模型性能相较原模型会下降0.65%,详细信息可见[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86)
进入PaddleOCR根目录,通过以下命令对模型进行敏感度分析训练:
```bash
python deploy/slim/prune/sensitivity_anal.py -c configs/det/det_mv3_db.yml -o Gl
```
### 4. 模型裁剪训练
-裁剪时通过之前的敏感度分析文件决定每个网络层的裁剪比例。在具体实现时,为了尽可能多的保留从图像中提取的低阶特征,我们跳过了backbone中靠近输入的4个卷积层。同样,为了减少由于裁剪导致的模型性能损失,我们通过之前敏感度分析所获得的敏感度表,挑选出了一些冗余较少,对裁剪较为敏感的[网络层](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/prune/pruning_and_finetune.py#L41),并在之后的裁剪过程中选择避开这些网络层。裁剪过后finetune的过程沿用OCR检测模型原始的训练策略。
+裁剪时通过之前的敏感度分析文件决定每个网络层的裁剪比例。在具体实现时,为了尽可能多的保留从图像中提取的低阶特征,我们跳过了backbone中靠近输入的4个卷积层。同样,为了减少由于裁剪导致的模型性能损失,我们通过之前敏感度分析所获得的敏感度表,人工挑选出了一些冗余较少,对裁剪较为敏感的[网络层](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/prune/pruning_and_finetune.py#L41)(指在较低的裁剪比例下就导致很高性能损失的网络层),并在之后的裁剪过程中选择避开这些网络层。裁剪过后finetune的过程沿用OCR检测模型原始的训练策略。
```bash
python deploy/slim/prune/pruning_and_finetune.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1
diff --git a/deploy/slim/prune/README_en.md b/deploy/slim/prune/README_en.md
index d854c107..fee0b12f 100644
--- a/deploy/slim/prune/README_en.md
+++ b/deploy/slim/prune/README_en.md
@@ -115,6 +115,7 @@ Compress results:
Generally, a more complex model would achive better performance in the task, but it also leads to some redundancy in the model. Model Pruning is a technique that reduces this redundancy by removing the sub-models in the neural network model, so as to reduce model calculation complexity and improve model inference performance.
This example uses PaddleSlim provided[APIs of Pruning](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/) to compress the OCR model.
+PaddleSlim (GitHub: https://github.com/PaddlePaddle/PaddleSlim) is an open source library that integrates model pruning, quantization (including quantization-aware training and offline quantization), distillation, neural network architecture search, and many other commonly used, industry-leading model compression techniques.
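For a concrete feel of how the sensitivity file described above is consumed, the short Python sketch below loads it and derives per-layer prune ratios. It assumes PaddleSlim's static-graph prune helpers `load_sensitivities` and `get_ratios_by_loss`; the file name and the 0.5% accuracy-loss budget are only example values.

```python
# Illustrative sketch: inspect sensitivities_0.data and pick per-layer prune ratios.
from paddleslim.prune import load_sensitivities, get_ratios_by_loss

# Dict of {param_name: {pruning_ratio: acc_loss}} written by the sensitivity analysis step
sensitivities = load_sensitivities("sensitivities_0.data")
print(sensitivities["conv10_expand_weights"])  # e.g. {0.1: 0.0065, 0.2: 0.018, ...}

# For every layer, choose the largest ratio whose accuracy loss stays below 0.5%
ratios = get_ratios_by_loss(sensitivities, 0.005)
print(ratios)
```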
It is recommended that you could understand following pages before reading this example,:
@@ -146,7 +147,20 @@ python setup.py install
## Pruning sensitivity analysis
- After the pre-training model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, thereby determining the pruning ratio of each network layer. For specific details of sensitivity analysis, see:[Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
+ After the pre-training model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, and a sensitivity file named sensitivities_0.data is saved. After that, the user can load the sensitivity file via the [methods provided by PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221) and determine the pruning ratio of each network layer automatically. For specific details of sensitivity analysis, see: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
+ The data format of the sensitivity file:
+    sensitivities_0.data(Dict){
+            'layer_weight_name_0': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
+            'layer_weight_name_1': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
+    }
+
+    example:
+    {
+        'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594}
+        'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405}
+    }
+ Loading the sensitivity file returns a dict. The keys of the dict are the names of the parameters in each layer, and the value of each key is a dict holding the pruning sensitivity information of the corresponding layer. In the example, pruning 10% of the filters of the layer corresponding to conv10_expand_weights would lead to a 0.65% degradation of model performance. The details can be found at: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86)
+
Enter the PaddleOCR root directory,perform sensitivity analysis on the model with the following command:
diff --git a/deploy/slim/quantization/README.md b/deploy/slim/quantization/README.md
index bf801d71..b35761c6 100755
--- a/deploy/slim/quantization/README.md
+++ b/deploy/slim/quantization/README.md
@@ -3,7 +3,8 @@
复杂的模型有利于提高模型的性能,但也导致模型中存在一定冗余,模型量化将全精度缩减到定点数减少这种冗余,达到减少模型计算复杂度,提高模型推理性能的目的。
模型量化可以在基本不损失模型的精度的情况下,将FP32精度的模型参数转换为Int8精度,减小模型参数大小并加速计算,使用量化后的模型在移动端等部署时更具备速度优势。
-本教程将介绍如何使用PaddleSlim量化PaddleOCR的模型。
+本教程将介绍如何使用飞桨模型压缩库PaddleSlim做PaddleOCR模型的压缩。
+PaddleSlim(项目链接:https://github.com/PaddlePaddle/PaddleSlim)集成了模型剪枝、量化(包括量化训练和离线量化)、蒸馏和神经网络搜索等多种业界常用且领先的模型压缩功能,如果您感兴趣,可以关注并了解。
在开始本教程之前,建议先了解[PaddleOCR模型的训练方法](../../../doc/doc_ch/quickstart.md)以及[PaddleSlim](https://paddleslim.readthedocs.io/zh_CN/latest/index.html)
diff --git a/deploy/slim/quantization/README_en.md b/deploy/slim/quantization/README_en.md
index 4b8a2b23..69bd603a 100755
--- a/deploy/slim/quantization/README_en.md
+++ b/deploy/slim/quantization/README_en.md
@@ -116,6 +116,7 @@ Compress results:
Generally, a more complex model would achive better performance in the task, but it also leads to some redundancy in the model. Quantization is a technique that reduces this redundancyby reducing the full precision data to a fixed number, so as to reduce model calculation complexity and improve model inference performance.
This example uses PaddleSlim provided [APIs of Quantization](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) to compress the OCR model.
+PaddleSlim (GitHub: https://github.com/PaddlePaddle/PaddleSlim) is an open source library that integrates model pruning, quantization (including quantization-aware training and offline quantization), distillation, neural network architecture search, and many other commonly used, industry-leading model compression techniques.
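To build intuition for the FP32-to-Int8 conversion mentioned above, the toy NumPy sketch below applies a simple per-tensor abs_max scheme and measures the round-trip error. It only illustrates the idea; it is not PaddleSlim's implementation, and all values are made up.

```python
import numpy as np

# Toy abs_max quantization: one FP32 scale per tensor, int8 storage, then dequantize.
weights = np.random.randn(4, 4).astype("float32")

scale = np.abs(weights).max() / 127.0                      # per-tensor scale
quantized = np.clip(np.round(weights / scale), -127, 127).astype("int8")
dequantized = quantized.astype("float32") * scale          # what inference effectively uses

print("int8 weights:\n", quantized)
print("max abs error:", np.abs(weights - dequantized).max())
```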
It is recommended that you could understand following pages before reading this example,: diff --git a/doc/doc_ch/config.md b/doc/doc_ch/config.md index fe8db9c8..f9c664d4 100644 --- a/doc/doc_ch/config.md +++ b/doc/doc_ch/config.md @@ -10,7 +10,7 @@ ## 配置文件 Global 参数介绍 -以 `rec_chinese_lite_train.yml` 为例 +以 `rec_chinese_lite_train_v1.1.yml ` 为例 | 字段 | 用途 | 默认值 | 备注 | @@ -32,6 +32,7 @@ | loss_type | 设置 loss 类型 | ctc | 支持两种loss: ctc / attention | | distort | 设置是否使用数据增强 | false | 设置为true时,将在训练时随机进行扰动,支持的扰动操作可阅读[img_tools.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/rec/img_tools.py) | | use_space_char | 设置是否识别空格 | false | 仅在 character_type=ch 时支持空格 | +| label_list | 设置方向分类器支持的角度 | ['0','180'] | 仅在方向分类器中生效 | | average_window | ModelAverage优化器中的窗口长度计算比例 | 0.15 | 目前仅应用与SRN | | max_average_window | 平均值计算窗口长度的最大值 | 15625 | 推荐设置为一轮训练中mini-batchs的数目| | min_average_window | 平均值计算窗口长度的最小值 | 10000 | \ | diff --git a/doc/doc_ch/detection.md b/doc/doc_ch/detection.md index c2b62edb..3945b7f0 100644 --- a/doc/doc_ch/detection.md +++ b/doc/doc_ch/detection.md @@ -44,13 +44,15 @@ json.dumps编码前的图像标注信息是包含多个字典的list,字典中 ## 快速启动训练 -首先下载模型backbone的pretrain model,PaddleOCR的检测模型目前支持两种backbone,分别是MobileNetV3、ResNet50_vd, +首先下载模型backbone的pretrain model,PaddleOCR的检测模型目前支持两种backbone,分别是MobileNetV3、ResNet_vd系列, 您可以根据需求使用[PaddleClas](https://github.com/PaddlePaddle/PaddleClas/tree/master/ppcls/modeling/architectures)中的模型更换backbone。 ```shell cd PaddleOCR/ # 下载MobileNetV3的预训练模型 wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV3_large_x0_5_pretrained.tar -# 下载ResNet50的预训练模型 +# 或,下载ResNet18_vd的预训练模型 +wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/ResNet18_vd_pretrained.tar +# 或,下载ResNet50_vd的预训练模型 wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_ssld_pretrained.tar # 解压预训练模型文件,以MobileNetV3为例 @@ -72,24 +74,24 @@ tar -xf ./pretrain_models/MobileNetV3_large_x0_5_pretrained.tar ./pretrain_model ```shell # 训练 mv3_db 模型,并将训练日志保存为 tain_det.log -python3 tools/train.py -c configs/det/det_mv3_db.yml \ +python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml \ -o Global.pretrain_weights=./pretrain_models/MobileNetV3_large_x0_5_pretrained/ \ 2>&1 | tee train_det.log ``` -上述指令中,通过-c 选择训练使用configs/det/det_db_mv3.yml配置文件。 +上述指令中,通过-c 选择训练使用configs/det/det_db_mv3_v1.1.yml配置文件。 有关配置文件的详细解释,请参考[链接](./config.md)。 您也可以通过-o参数在不需要修改yml文件的情况下,改变训练的参数,比如,调整训练的学习率为0.0001 ```shell -python3 tools/train.py -c configs/det/det_mv3_db.yml -o Optimizer.base_lr=0.0001 +python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml -o Optimizer.base_lr=0.0001 ``` #### 断点训练 如果训练程序中断,如果希望加载训练中断的模型从而恢复训练,可以通过指定Global.checkpoints指定要加载的模型路径: ```shell -python3 tools/train.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=./your/trained/model +python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints=./your/trained/model ``` **注意**:`Global.checkpoints`的优先级高于`Global.pretrain_weights`的优先级,即同时指定两个参数时,优先加载`Global.checkpoints`指定的模型,如果`Global.checkpoints`指定的模型路径有误,会加载`Global.pretrain_weights`指定的模型。 @@ -98,17 +100,17 @@ python3 tools/train.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=./you PaddleOCR计算三个OCR检测相关的指标,分别是:Precision、Recall、Hmean。 -运行如下代码,根据配置文件`det_db_mv3.yml`中`save_res_path`指定的测试集检测结果文件,计算评估指标。 +运行如下代码,根据配置文件`det_db_mv3_v1.1.yml`中`save_res_path`指定的测试集检测结果文件,计算评估指标。 评估时设置后处理参数`box_thresh=0.6`,`unclip_ratio=1.5`,使用不同数据集、不同模型训练,可调整这两个参数进行优化 ```shell -python3 tools/eval.py -c 
configs/det/det_mv3_db.yml -o Global.checkpoints="{path/to/weights}/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 +python3 tools/eval.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints="{path/to/weights}/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 ``` 训练中模型参数默认保存在`Global.save_model_dir`目录下。在评估指标时,需要设置`Global.checkpoints`指向保存的参数文件。 比如: ```shell -python3 tools/eval.py -c configs/det/det_mv3_db.yml -o Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 +python3 tools/eval.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 ``` * 注:`box_thresh`、`unclip_ratio`是DB后处理所需要的参数,在评估EAST模型时不需要设置 @@ -117,16 +119,16 @@ python3 tools/eval.py -c configs/det/det_mv3_db.yml -o Global.checkpoints="./ou 测试单张图像的检测效果 ```shell -python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" +python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" ``` 测试DB模型时,调整后处理阈值, ```shell -python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 +python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 ``` 测试文件夹下所有图像的检测效果 ```shell -python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/" Global.checkpoints="./output/det_db/best_accuracy" +python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/" Global.checkpoints="./output/det_db/best_accuracy" ``` diff --git a/doc/doc_ch/installation.md b/doc/doc_ch/installation.md index 381d1a9e..d4b0a67f 100644 --- a/doc/doc_ch/installation.md +++ b/doc/doc_ch/installation.md @@ -2,7 +2,7 @@ 经测试PaddleOCR可在glibc 2.23上运行,您也可以测试其他glibc版本或安装glic 2.23 PaddleOCR 工作环境 -- PaddlePaddle 1.7+ +- PaddlePaddle 1.8+ ,推荐使用 PaddlePaddle 2.0.0.beta - python3.7 - glibc 2.23 - cuDNN 7.6+ (GPU) @@ -47,19 +47,16 @@ docker images hub.baidubce.com/paddlepaddle/paddle latest-gpu-cuda9.0-cudnn7-dev f56310dcc829 ``` -**2. 安装PaddlePaddle Fluid v1.7** +**2. 
安装PaddlePaddle Fluid v2.0** ``` pip3 install --upgrade pip -如果您的机器安装的是CUDA9,请运行以下命令安装 -python3 -m pip install paddlepaddle-gpu==1.7.2.post97 -i https://pypi.tuna.tsinghua.edu.cn/simple - -如果您的机器安装的是CUDA10,请运行以下命令安装 -python3 -m pip install paddlepaddle-gpu==1.7.2.post107 -i https://pypi.tuna.tsinghua.edu.cn/simple +如果您的机器安装的是CUDA9或CUDA10,请运行以下命令安装 +python3 -m pip install paddlepaddle-gpu==2.0.0b0 -i https://mirror.baidu.com/pypi/simple 如果您的机器是CPU,请运行以下命令安装 -python3 -m pip install paddlepaddle==1.7.2 -i https://pypi.tuna.tsinghua.edu.cn/simple +python3 -m pip install paddlepaddle==2.0.0b0 -i https://mirror.baidu.com/pypi/simple 更多的版本需求,请参照[安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。 ``` diff --git a/doc/doc_ch/recognition.md b/doc/doc_ch/recognition.md index 1df5ac3d..918050cc 100644 --- a/doc/doc_ch/recognition.md +++ b/doc/doc_ch/recognition.md @@ -259,6 +259,7 @@ PaddleOCR也提供了多语言的, `configs/rec/multi_languages` 路径下的 | rec_japan_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | 日语 | | rec_korean_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | 韩语 | + 多语言模型训练方式与中文模型一致,训练数据集均为100w的合成数据,少量的字体可以在 [百度网盘](https://pan.baidu.com/s/1bS_u207Rm7YbY33wOECKDA) 上下载,提取码:frgi。 如您希望在现有模型效果的基础上调优,请参考下列说明修改配置文件: diff --git a/doc/doc_ch/whl.md b/doc/doc_ch/whl.md index 46796ce6..1b04a9a8 100644 --- a/doc/doc_ch/whl.md +++ b/doc/doc_ch/whl.md @@ -11,8 +11,8 @@ pip install paddleocr 本地构建并安装 ```bash -python setup.py bdist_wheel -pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x是paddleocr的版本号 +python3 setup.py bdist_wheel +pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x是paddleocr的版本号 ``` ### 1. 代码使用 @@ -20,7 +20,7 @@ pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x是paddleocr的版本 ```python from paddleocr import PaddleOCR, draw_ocr # Paddleocr目前支持中英文、英文、法语、德语、韩语、日语,可以通过修改lang参数进行切换 -# 参数依次为`zh`, `en`, `french`, `german`, `korean`, `japan`。 +# 参数依次为`ch`, `en`, `french`, `german`, `korean`, `japan`。 ocr = PaddleOCR(use_angle_cls=True, lang="ch") # need to run only once to download and load model into memory img_path = 'PaddleOCR/doc/imgs/11.jpg' result = ocr.ocr(img_path, cls=True) @@ -280,7 +280,7 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_ | rec_algorithm | 使用的识别算法类型 | CRNN | | rec_model_dir | 识别模型所在文件夹。传参方式有两种,1. 
None: 自动下载内置模型到 `~/.paddleocr/rec`;2.自己转换好的inference模型路径,模型路径下必须包含model和params文件 | None |
| rec_image_shape | 识别算法的输入图片尺寸 | "3,32,320" |
-| rec_char_type | 识别算法的字符类型,中文(ch)或英文(en) | ch |
+| rec_char_type | 识别算法的字符类型,中英文(ch)、英文(en)、法语(french)、德语(german)、韩语(korean)、日语(japan) | ch |
| rec_batch_num | 进行识别时,同时前向的图片数 | 30 |
| max_text_length | 识别算法能识别的最大文字长度 | 25 |
| rec_char_dict_path | 识别模型字典路径,当rec_model_dir使用方式2传参时需要修改为自己的字典路径 | ./ppocr/utils/ppocr_keys_v1.txt |
diff --git a/doc/doc_en/config_en.md b/doc/doc_en/config_en.md
index b54def89..722da662 100644
--- a/doc/doc_en/config_en.md
+++ b/doc/doc_en/config_en.md
@@ -10,7 +10,7 @@ The following list can be viewed via `--help`
## INTRODUCTION TO GLOBAL PARAMETERS OF CONFIGURATION FILE
-Take `rec_chinese_lite_train.yml` as an example
+Take `rec_chinese_lite_train_v1.1.yml` as an example
| Parameter | Use | Default | Note |
@@ -32,6 +32,7 @@ Take `rec_chinese_lite_train.yml` as an example
| loss_type | Set loss type | ctc | Supports two types of loss: ctc / attention |
| distort | Set use distort | false | Support distort type ,read [img_tools.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/rec/img_tools.py) |
| use_space_char | Wether to recognize space | false | Only support in character_type=ch mode |
+| label_list | Set the angles supported by the direction classifier | ['0','180'] | Only valid in the direction classifier |
| reader_yml | Set the reader configuration file | ./configs/rec/rec_icdar15_reader.yml | \ |
| pretrain_weights | Load pre-trained model path | ./pretrain_models/CRNN/best_accuracy | \ |
| checkpoints | Load saved model path | None | Used to load saved parameters to continue training after interruption |
diff --git a/doc/doc_en/detection_en.md b/doc/doc_en/detection_en.md
index 401d7a9a..96928364 100644
--- a/doc/doc_en/detection_en.md
+++ b/doc/doc_en/detection_en.md
@@ -38,12 +38,14 @@ If you want to train PaddleOCR on other datasets, please build the annotation fi
## TRAINING
-First download the pretrained model. The detection model of PaddleOCR currently supports two backbones, namely MobileNetV3 and ResNet50_vd. You can use the model in [PaddleClas](https://github.com/PaddlePaddle/PaddleClas/tree/master/ppcls/modeling/architectures) to replace backbone according to your needs.
+First download the pretrained model. The detection model of PaddleOCR currently supports three backbones, namely MobileNetV3, ResNet18_vd and ResNet50_vd. You can use the models in [PaddleClas](https://github.com/PaddlePaddle/PaddleClas/tree/master/ppcls/modeling/architectures) to replace the backbone according to your needs.
```shell cd PaddleOCR/ # Download the pre-trained model of MobileNetV3 wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV3_large_x0_5_pretrained.tar -# Download the pre-trained model of ResNet50 +# or, download the pre-trained model of ResNet18_vd +wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/ResNet18_vd_pretrained.tar +# or, download the pre-trained model of ResNet50_vd wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_ssld_pretrained.tar # decompressing the pre-training model file, take MobileNetV3 as an example @@ -62,7 +64,7 @@ tar -xf ./pretrain_models/MobileNetV3_large_x0_5_pretrained.tar ./pretrain_model #### START TRAINING *If CPU version installed, please set the parameter `use_gpu` to `false` in the configuration.* ```shell -python3 tools/train.py -c configs/det/det_mv3_db.yml 2>&1 | tee train_det.log +python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml 2>&1 | tee train_det.log ``` In the above instruction, use `-c` to select the training to use the `configs/det/det_db_mv3.yml` configuration file. @@ -70,7 +72,7 @@ For a detailed explanation of the configuration file, please refer to [config](. You can also use `-o` to change the training parameters without modifying the yml file. For example, adjust the training learning rate to 0.0001 ```shell -python3 tools/train.py -c configs/det/det_mv3_db.yml -o Optimizer.base_lr=0.0001 +python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml -o Optimizer.base_lr=0.0001 ``` #### load trained model and continue training @@ -78,7 +80,7 @@ If you expect to load trained model and continue the training again, you can spe For example: ```shell -python3 tools/train.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=./your/trained/model +python3 tools/train.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints=./your/trained/model ``` **Note**: The priority of `Global.checkpoints` is higher than that of `Global.pretrain_weights`, that is, when two parameters are specified at the same time, the model specified by `Global.checkpoints` will be loaded first. If the model path specified by `Global.checkpoints` is wrong, the one specified by `Global.pretrain_weights` will be loaded. @@ -88,18 +90,18 @@ python3 tools/train.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=./you PaddleOCR calculates three indicators for evaluating performance of OCR detection task: Precision, Recall, and Hmean. -Run the following code to calculate the evaluation indicators. The result will be saved in the test result file specified by `save_res_path` in the configuration file `det_db_mv3.yml` +Run the following code to calculate the evaluation indicators. The result will be saved in the test result file specified by `save_res_path` in the configuration file `det_db_mv3_v1.1.yml` When evaluating, set post-processing parameters `box_thresh=0.6`, `unclip_ratio=1.5`. If you use different datasets, different models for training, these two parameters should be adjusted for better result. ```shell -python3 tools/eval.py -c configs/det/det_mv3_db.yml -o Global.checkpoints="{path/to/weights}/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 +python3 tools/eval.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints="{path/to/weights}/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 ``` The model parameters during training are saved in the `Global.save_model_dir` directory by default. 
When evaluating indicators, you need to set `Global.checkpoints` to point to the saved parameter file. Such as: ```shell -python3 tools/eval.py -c configs/det/det_mv3_db.yml -o Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 +python3 tools/eval.py -c configs/det/det_mv3_db_v1.1.yml -o Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 ``` * Note: `box_thresh` and `unclip_ratio` are parameters required for DB post-processing, and not need to be set when evaluating the EAST model. @@ -108,16 +110,16 @@ python3 tools/eval.py -c configs/det/det_mv3_db.yml -o Global.checkpoints="./ou Test the detection result on a single image: ```shell -python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" +python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" ``` When testing the DB model, adjust the post-processing threshold: ```shell -python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 +python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5 ``` Test the detection result on all images in the folder: ```shell -python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/" Global.checkpoints="./output/det_db/best_accuracy" +python3 tools/infer_det.py -c configs/det/det_mv3_db_v1.1.yml -o TestReader.infer_img="./doc/imgs_en/" Global.checkpoints="./output/det_db/best_accuracy" ``` diff --git a/doc/doc_en/installation_en.md b/doc/doc_en/installation_en.md index b62d9b29..7608e12d 100644 --- a/doc/doc_en/installation_en.md +++ b/doc/doc_en/installation_en.md @@ -3,7 +3,7 @@ After testing, paddleocr can run on glibc 2.23. You can also test other glibc versions or install glic 2.23 for the best compatibility. PaddleOCR working environment: -- PaddlePaddle1.7 +- PaddlePaddle1.8+, Recommend PaddlePaddle 2.0.0.beta - python3.7 - glibc 2.23 @@ -49,18 +49,15 @@ docker images hub.baidubce.com/paddlepaddle/paddle latest-gpu-cuda9.0-cudnn7-dev f56310dcc829 ``` -**2. Install PaddlePaddle Fluid v1.7 (the higher version is not supported yet, the adaptation work is in progress)** +**2. 
Install PaddlePaddle Fluid v2.0**
```
pip3 install --upgrade pip
-# If you have cuda9 installed on your machine, please run the following command to install
-python3 -m pip install paddlepaddle-gpu==1.7.2.post97 -i https://pypi.tuna.tsinghua.edu.cn/simple
-
-# If you have cuda10 installed on your machine, please run the following command to install
-python3 -m pip install paddlepaddle-gpu==1.7.2.post107 -i https://pypi.tuna.tsinghua.edu.cn/simple
+# If you have cuda9 or cuda10 installed on your machine, please run the following command to install
+python3 -m pip install paddlepaddle-gpu==2.0.0b0 -i https://mirror.baidu.com/pypi/simple
# If you only have cpu on your machine, please run the following command to install
-python3 -m pip install paddlepaddle==1.7.2 -i https://pypi.tuna.tsinghua.edu.cn/simple
+python3 -m pip install paddlepaddle==2.0.0b0 -i https://mirror.baidu.com/pypi/simple
```
For more software version requirements, please refer to the instructions in [Installation Document](https://www.paddlepaddle.org.cn/install/quick) for operation.
diff --git a/doc/doc_en/whl_en.md b/doc/doc_en/whl_en.md
index 4049d9dc..ffbced34 100644
--- a/doc/doc_en/whl_en.md
+++ b/doc/doc_en/whl_en.md
@@ -9,8 +9,8 @@ pip install paddleocr
build own whl package and install
```bash
-python setup.py bdist_wheel
-pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr
+python3 setup.py bdist_wheel
+pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr
```
### 1. Use by code
@@ -18,7 +18,7 @@
```python
from paddleocr import PaddleOCR,draw_ocr
# Paddleocr supports Chinese, English, French, German, Korean and Japanese.
-# You can set the parameter `lang` as `zh`, `en`, `french`, `german`, `korean`, `japan`
+# You can set the parameter `lang` as `ch`, `en`, `french`, `german`, `korean`, `japan`
# to switch the language model in order.
ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
@@ -302,7 +302,7 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_
| cls_batch_num | When performing classification, the batchsize of forward images | 30 |
| enable_mkldnn | Whether to enable mkldnn | FALSE |
| use_zero_copy_run | Whether to forward by zero_copy_run | FALSE |
-| lang | The support language, now only chinese(ch) and english(en) are supported | ch |
+| lang | The language to use; currently Chinese (ch), English (en), French (french), German (german), Korean (korean) and Japanese (japan) are supported | ch |
| det | Enable detction when `ppocr.ocr` func exec | TRUE |
| rec | Enable recognition when `ppocr.ocr` func exec | TRUE |
| cls | Enable classification when `ppocr.ocr` func exec | FALSE |
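Putting the whl parameters above together, here is a short usage sketch; the language, image path and flags are example values drawn from the documentation in this change, and the printed format follows the earlier whl_en.md examples.

```python
from paddleocr import PaddleOCR

# lang accepts 'ch', 'en', 'french', 'german', 'korean', 'japan' (see the lang row above);
# use_angle_cls / cls enable the direction classifier introduced in this change.
ocr = PaddleOCR(use_angle_cls=True, lang='en')  # downloads and loads models on the first run

# det / rec / cls mirror the table entries: detection, recognition, angle classification
result = ocr.ocr('PaddleOCR/doc/imgs_en/img_12.jpg', det=True, rec=True, cls=True)
for line in result:
    print(line)  # [[box points], (text, confidence)]
```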