Merge remote-tracking branch 'upstream/dygraph' into dy3

This commit is contained in:
Leif 2020-12-17 15:18:20 +08:00
commit 246a0bce7d
51 changed files with 155 additions and 79 deletions

README_ch.md Normal file → Executable file

@ -9,7 +9,7 @@ PaddleOCR supports both dynamic graph and static graph programming paradigms
**Recent updates**
- 2020.12.15 Added the data synthesis tool [Style-Text](./StyleText/README_ch.md), which can batch-synthesize large numbers of images similar to a target scene; validated in multiple scenarios with clear accuracy gains.
- 2020.12.07 [FAQ](./doc/doc_ch/FAQ.md) added 5 frequently asked questions (124 in total), with updates planned every Monday; stay tuned.
- 2020.12.14 [FAQ](./doc/doc_ch/FAQ.md) added 5 frequently asked questions (127 in total), updated every Monday; stay tuned.
- 2020.11.25 Updated the semi-automatic annotation tool [PPOCRLabel](./PPOCRLabel/README_ch.md), which helps developers complete annotation efficiently; its output format connects directly to PP-OCR training.
- 2020.9.22 Published the PP-OCR technical report: https://arxiv.org/abs/2009.09941
- [More](./doc/doc_ch/update.md)
@ -39,6 +39,14 @@ PaddleOCR supports both dynamic graph and static graph programming paradigms
The image above shows the output of the general ppocr_server model; more examples are on the [visualization page](./doc/doc_ch/visualization.md).
<a name="欢迎加入PaddleOCR技术交流群"></a>
## Join the PaddleOCR technical discussion group
- Scan the QR code with WeChat to join the official group for faster answers to your questions and discussion with developers from many industries. We look forward to having you.
<div align="center">
<img src="./doc/joinus.PNG" width = "200" height = "200" />
</div>
## Quick start
- Online demo of the ultra-lightweight Chinese OCR model (PC): https://www.paddlepaddle.org.cn/hub/scene/ocr
@ -121,7 +129,7 @@ PP-OCR is a practical ultra-lightweight OCR system, mainly composed of DB text detection, detection box
- English model
<div align="center">
<img src="./doc/imgs_results/img_12.jpg" width="800">
<img src="./doc/imgs_results/ch_ppocr_mobile_v2.0/img_12.jpg" width="800">
</div>
- Models for other languages
@ -130,13 +138,6 @@ PP-OCR is a practical ultra-lightweight OCR system, mainly composed of DB text detection, detection box
<img src="./doc/imgs_results/korean.jpg" width="800">
</div>
<a name="欢迎加入PaddleOCR技术交流群"></a>
## Join the PaddleOCR technical discussion group
Please scan the QR code below and fill out the questionnaire to get the group QR code and OCR tuning tips
<div align="center">
<img src="./doc/joinus.PNG" width = "200" height = "200" />
</div>
<a name="许可证书"></a>
## License


@ -72,7 +72,10 @@ fusion_generator:
python3 -m tools.synth_image -c configs/config.yml --style_image examples/style_images/2.jpg --text_corpus PaddleOCR --language en
```
* Note: The language option corresponds to the corpus. Currently, the tool only supports English, Simplified Chinese and Korean.
* Note 1: The language option corresponds to the corpus. Currently, the tool only supports English, Simplified Chinese and Korean.
* Note 2: Style-Text is mainly used to generate images for OCR recognition models.
So the height of style images should be around 32 pixels. Images of other sizes may behave poorly.
For example, take the following image and the corpus `PaddleOCR` as input:
@ -116,9 +119,17 @@ In actual application scenarios, it is often necessary to synthesize pictures in
* `CorpusGenerator`
* `method`: Method of CorpusGenerator; supports `FileCorpus` and `EnNumCorpus`. If `EnNumCorpus` is used, no other configuration is needed; otherwise you need to set `corpus_file` and `language`.
* `language`: Language of the corpus.
* `corpus_file`: Filepath of the corpus.
* `corpus_file`: Filepath of the corpus. The corpus file should be a plain-text file that will be split on line endings ('\n'); the corpus generator samples one line each time.
Example of corpus file:
```
PaddleOCR
飞桨文字识别
StyleText
风格文本图像数据合成
```
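As a rough illustration, this line-based sampling can be sketched in Python (a hypothetical helper, not the actual StyleText code):

```python
import random

def sample_corpus_line(corpus_path):
    # Split the corpus file on '\n' and draw one non-empty line,
    # mirroring how a FileCorpus-style generator samples text each time.
    with open(corpus_path, encoding="utf-8") as f:
        lines = [line.strip() for line in f.read().split("\n") if line.strip()]
    return random.choice(lines)
```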
We provide a general dataset containing Chinese, English and Korean (50,000 images in all) for your trial ([download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/style_text/chkoen_5w.tar)), some examples are given below :
<div align="center">
@ -130,7 +141,18 @@ We provide a general dataset containing Chinese, English and Korean (50,000 imag
``` bash
python -m tools.synth_dataset -c configs/dataset_config.yml
```
We also provide example corpus and images in `examples` folder.
<div align="center">
<img src="examples/style_images/1.jpg" width="300">
<img src="examples/style_images/2.jpg" width="300">
</div>
If you run the code above directly, you will get example output data in the `output_data` folder.
You will get synthesis images and labels as below:
<div align="center">
<img src="doc/images/12.png" width="800">
</div>
There will be some cache files under the `label` folder. If the program exits unexpectedly, you can find cached labels there.
When the program finishes normally, you will find all the labels in `label.txt`, which gives the final results.
<a name="Applications"></a>
### Applications


@ -63,7 +63,10 @@ fusion_generator:
```bash
python3 -m tools.synth_image -c configs/config.yml --style_image examples/style_images/2.jpg --text_corpus PaddleOCR --language en
```
* Note: The language option corresponds to the corpus; currently the tool supports only English, Simplified Chinese, and Korean.
* Note 1: The language option corresponds to the corpus; currently the tool supports only English, Simplified Chinese, and Korean.
* Note 2: Data generated by Style-Text is mainly used for OCR recognition scenarios. Given the design of the current PaddleOCR recognition models, we mainly support style images with a height of around 32 pixels.
If the input image size differs too much from this, the results may be poor.
For example, given the following image and the corpus "PaddleOCR":
@ -102,7 +105,16 @@ python3 -m tools.synth_image -c configs/config.yml --style_image examples/style_
* `CorpusGenerator`
* `method`: corpus generation method; currently `FileCorpus` and `EnNumCorpus` are available. If `EnNumCorpus` is used, no other fields are needed; otherwise `corpus_file` and `language` must be set;
* `language`: language of the corpus;
* `corpus_file`: path to the corpus file.
* `corpus_file`: path to the corpus file. The corpus file should be a plain-text file; the corpus generator first splits it by line and then randomly samples one line each time.
Example corpus file format:
```
PaddleOCR
飞桨文字识别
StyleText
风格文本图像数据合成
...
```
Style-Text also provides a general-purpose dataset of 50,000 Chinese, English, and Korean images for use as style images, making it easy to synthesize text images across rich scenes; some examples are shown below.
@ -117,6 +129,19 @@ python3 -m tools.synth_image -c configs/config.yml --style_image examples/style_
``` bash
python -m tools.synth_dataset -c configs/dataset_config.yml
```
Sample images and corpora are provided in the `examples` directory.
<div align="center">
<img src="examples/style_images/1.jpg" width="300">
<img src="examples/style_images/2.jpg" width="300">
</div>
Running the command above directly produces sample output in `output_data`, including images and the label files used to train recognition models:
<div align="center">
<img src="doc/images/12.png" width="800">
</div>
The label files under the `label` directory are caches produced while the program runs; if the program terminates unexpectedly, the cached labels can be used.
If the program finishes normally, `label.txt` is generated under `output_data` as the final annotation result.
<a name="应用案例"></a>
### 4. Applications


@ -33,7 +33,7 @@ Predictor:
- 0.5
expand_result: false
bg_generator:
pretrain: models/style_text_rec/bg_generator
pretrain: style_text_models/bg_generator
module_name: bg_generator
generator_type: BgGeneratorWithMask
encode_dim: 64
@ -43,7 +43,7 @@ Predictor:
conv_block_dilation: true
output_factor: 1.05
text_generator:
pretrain: models/style_text_rec/text_generator
pretrain: style_text_models/text_generator
module_name: text_generator
generator_type: TextGenerator
encode_dim: 64
@ -52,7 +52,7 @@ Predictor:
conv_block_dropout: false
conv_block_dilation: true
fusion_generator:
pretrain: models/style_text_rec/fusion_generator
pretrain: style_text_models/fusion_generator
module_name: fusion_generator
generator_type: FusionGeneratorSimple
encode_dim: 64

BIN StyleText/doc/images/12.png Normal file (added, 148 KiB)


@ -1,2 +1,2 @@
PaddleOCR
Paddle
飞桨文字识别


@ -2,11 +2,11 @@ Global:
use_gpu: true
epoch_num: 1200
log_smooth_window: 20
print_batch_step: 2
print_batch_step: 10
save_model_dir: ./output/db_mv3/
save_epoch_step: 1200
# evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: [4000, 5000]
# evaluation is run every 2000 iterations
eval_batch_step: [0, 2000]
# if pretrained_model is saved in static mode, load_static_weights must set to True
load_static_weights: True
cal_metric_during_train: False
@ -39,7 +39,7 @@ Loss:
alpha: 5
beta: 10
ohem_ratio: 3
Optimizer:
name: Adam
beta1: 0.9
@ -100,7 +100,7 @@ Train:
loader:
shuffle: True
drop_last: False
batch_size_per_card: 4
batch_size_per_card: 16
num_workers: 8
Eval:
@ -128,4 +128,4 @@ Eval:
shuffle: False
drop_last: False
batch_size_per_card: 1 # must be 1
num_workers: 2
num_workers: 8


@ -5,8 +5,8 @@ Global:
print_batch_step: 10
save_model_dir: ./output/det_r50_vd/
save_epoch_step: 1200
# evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: [5000,4000]
# evaluation is run every 2000 iterations
eval_batch_step: [0,2000]
# if pretrained_model is saved in static mode, load_static_weights must set to True
load_static_weights: True
cal_metric_during_train: False


@ -60,7 +60,8 @@ Metric:
Train:
dataset:
name: SimpleDataSet
label_file_path: [./train_data/art_latin_icdar_14pt/train_no_tt_test/train_label_json.txt, ./train_data/total_text_icdar_14pt/train_label_json.txt]
data_dir: ./train_data/
label_file_list: [./train_data/art_latin_icdar_14pt/train_no_tt_test/train_label_json.txt, ./train_data/total_text_icdar_14pt/train_label_json.txt]
data_ratio_list: [0.5, 0.5]
transforms:
- DecodeImage: # load image


@ -103,17 +103,17 @@ make inference_lib_dist
For more compilation options, refer to the official documentation of the Paddle C++ inference library: [https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html).
* After compilation, the following files and folders are generated under `build/fluid_inference_install_dir/`.
* After compilation, the following files and folders are generated under `build/paddle_inference_install_dir/`.
```
build/fluid_inference_install_dir/
build/paddle_inference_install_dir/
|-- CMakeCache.txt
|-- paddle
|-- third_party
|-- version.txt
```
Here `paddle` is the Paddle library needed later for C++ inference, and `version.txt` contains version information of the inference library.
Here `paddle` is the Paddle library needed for C++ inference, and `version.txt` contains version information of the inference library.
#### 1.2.2 Direct download and installation


@ -11,10 +11,15 @@ max_side_len 960
det_db_thresh 0.3
det_db_box_thresh 0.5
det_db_unclip_ratio 2.0
det_model_dir ./inference/det_db
det_model_dir ./inference/ch_ppocr_mobile_v2.0_det_infer/
# cls config
use_angle_cls 0
cls_model_dir ./inference/ch_ppocr_mobile_v2.0_cls_infer/
cls_thresh 0.9
# rec config
rec_model_dir ./inference/rec_crnn
rec_model_dir ./inference/ch_ppocr_mobile_v2.0_rec_infer/
char_list_file ../../ppocr/utils/ppocr_keys_v1.txt
# show the detection results

doc/doc_ch/FAQ.md Normal file → Executable file

@ -9,44 +9,42 @@
## PaddleOCR FAQ (continuously updated)
* [Recent updates (2020.12.07)](#近期更新)
* [Recent updates (2020.12.14)](#近期更新)
* [[Selected] 10 selected OCR questions](#OCR精选10个问题)
* [[Theory] 30 general OCR questions](#OCR通用问题)
* [Basics: 7 questions](#基础知识)
* [Datasets: 7 questions](#数据集2)
* [Model training and tuning: 7 questions](#模型训练调优2)
* [Inference and deployment: 9 questions](#预测部署2)
* [[Practice] 84 practical PaddleOCR questions](#PaddleOCR实战问题)
* [Usage: 20 questions](#使用咨询)
* [[Practice] 87 practical PaddleOCR questions](#PaddleOCR实战问题)
* [Usage: 21 questions](#使用咨询)
* [Datasets: 17 questions](#数据集3)
* [Model training and tuning: 24 questions](#模型训练调优3)
* [Inference and deployment: 23 questions](#预测部署3)
* [Model training and tuning: 25 questions](#模型训练调优3)
* [Inference and deployment: 24 questions](#预测部署3)
<a name="近期更新"></a>
## Recent updates (2020.12.07)
## Recent updates (2020.12.14)
#### Q2.4.9: For curved text, have you tried OpenCV's TPS for rectification?
#### Q3.1.21: Does PaddleOCR support dynamic graph mode?
**A**: OpenCV's TPS requires annotated points on the upper and lower text boundaries, which are hard to obtain with either traditional or deep learning methods. The TPS module in PaddleOCR's STAR-Net learns the control points and rectifies automatically; you can try it directly!
**A**: The dynamic graph version is under intensive development and will be released on December 16, 2020. Stay tuned!
#### Q3.3.20: How can I add blur data augmentation for text detection?
#### Q3.3.23: An elementwise_add error occurs during detection model training or inference
**A**: Blur augmentation requires a code change. Taking DB as an example, refer to [Normalize](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppocr/data/imaug/operators.py#L60) and add the blur augmentation there.
**A**: The configured input size must be a multiple of 32; otherwise, after the network's repeated down-sampling and up-sampling, the feature maps differ by one pixel, which triggers a shape-mismatch error in elementwise_add.
#### Q3.3.21: How can I change the image rotation angle in text detection to arbitrary 360-degree rotation?
#### Q3.3.24: Can the DB detection training input size (640) be increased?
**A**: Change (-10,10) [here](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppocr/data/imaug/iaa_augment.py#L64) to (-180,180).
**A**: Not recommended. The detection training input size is the size after random crop in preprocessing, not a direct resize of the original image; in most scenarios it is not small. Enlarging it may not help and slows training. In addition, some parameters in the code may be adapted to the preset input size, so enlarging it carries hidden risks.
#### Q3.3.22: How do I modify the shape when the training data has a large aspect ratio?
#### Q3.3.25: During recognition model training, the loss decreases normally but acc stays at 0
**A**: For recognition, modify [here](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yaml#L75);
for detection, modify [here](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml#L85).
**A**: It is normal for acc to be 0 in the early stage of recognition training; the metric rises after more training.
#### Q3.4.24: DB models run inference correctly, but EAST or SAST models error out or give wrong results
#### Q3.4.23: After installing paddleocr, it reports that paddle is missing
**A**: This is because the paddlepaddle GPU and CPU packages have different names; installation notes have been added to the [whl documentation](./whl.md).
**A**: When running inference with an EAST or SAST model, you need to specify --det_algorithm="EAST" or --det_algorithm="SAST" on the command line; DB does not need it because "DB" is the parameter's default value: https://github.com/PaddlePaddle/PaddleOCR/blob/e7a708e9fdaf413ed7a14da8e4a7b4ac0b211e42/tools/infer/utility.py#L43
<a name="OCR精选10个问题"></a>
## [Selected] 10 selected OCR questions
@ -390,6 +388,10 @@
**A**: PaddleOCR focuses on general OCR. For vertical-domain needs, you can train your own model with PaddleOCR plus domain data;
if you lack labeled data or do not want to invest in development, consider calling open APIs, which cover the more common vertical domains.
#### Q3.1.21: Does PaddleOCR support dynamic graph mode?
**A**: The dynamic graph version is under intensive development and will be released on December 16, 2020. Stay tuned!
<a name="数据集3"></a>
### Datasets
@ -603,6 +605,18 @@ ps -axu | grep train.py | awk '{print $2}' | xargs kill -9
**A**: For recognition, modify [here](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yaml#L75);
for detection, modify [here](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml#L85).
#### Q3.3.23: An elementwise_add error occurs during detection model training or inference
**A**: The configured input size must be a multiple of 32; otherwise, after the network's repeated down-sampling and up-sampling, the feature maps differ by one pixel, which triggers a shape-mismatch error in elementwise_add.
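One way to satisfy this multiple-of-32 constraint is to zero-pad the input image; a minimal NumPy sketch (illustrative only, not PaddleOCR's actual preprocessing):

```python
import numpy as np

def pad_to_multiple_of_32(img):
    # DB-style detectors repeatedly down- and up-sample, so H and W must be
    # multiples of 32, or skip-connection feature maps end up one pixel off,
    # producing the elementwise_add shape-mismatch error.
    h, w = img.shape[:2]
    new_h = ((h + 31) // 32) * 32
    new_w = ((w + 31) // 32) * 32
    padded = np.zeros((new_h, new_w) + img.shape[2:], dtype=img.dtype)
    padded[:h, :w] = img
    return padded
```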
#### Q3.3.24: Can the DB detection training input size (640) be increased?
**A**: Not recommended. The detection training input size is the size after random crop in preprocessing, not a direct resize of the original image; in most scenarios it is not small. Enlarging it may not help and slows training. In addition, some parameters in the code may be adapted to the preset input size, so enlarging it carries hidden risks.
#### Q3.3.25: During recognition model training, the loss decreases normally but acc stays at 0
**A**: It is normal for acc to be 0 in the early stage of recognition training; the metric rises after more training.
<a name="预测部署3"></a>
### Inference and deployment
@ -710,4 +724,8 @@ ps -axu | grep train.py | awk '{print $2}' | xargs kill -9
#### Q3.4.23: After installing paddleocr, it reports that paddle is missing
**A**: This is because the paddlepaddle GPU and CPU packages have different names; installation notes have been added to the [whl documentation](./whl.md).
**A**: This is because the paddlepaddle GPU and CPU packages have different names; installation notes have been added to the [whl documentation](./whl.md).
#### Q3.4.24: DB models run inference correctly, but EAST or SAST models error out or give wrong results
**A**: When running inference with an EAST or SAST model, you need to specify --det_algorithm="EAST" or --det_algorithm="SAST" on the command line; DB does not need it because "DB" is the parameter's default value: https://github.com/PaddlePaddle/PaddleOCR/blob/e7a708e9fdaf413ed7a14da8e4a7b4ac0b211e42/tools/infer/utility.py#L43
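The default described here can be illustrated with a minimal argparse sketch (a simplified stand-in for the flag definition in tools/infer/utility.py, not the actual code):

```python
import argparse

# --det_algorithm defaults to "DB", so DB inference can omit the flag;
# EAST and SAST must be selected explicitly on the command line.
parser = argparse.ArgumentParser()
parser.add_argument("--det_algorithm", type=str, default="DB")

default_args = parser.parse_args([])                        # no flag given
east_args = parser.parse_args(["--det_algorithm", "EAST"])  # explicit EAST
```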


@ -131,12 +131,12 @@ python3 tools/export_model.py -c configs/cls/cls_mv3.yml -o Global.pretrained_mo
# Download the ultra-lightweight Chinese detection model:
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
tar xf ch_ppocr_mobile_v2.0_det_infer.tar
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/22.jpg" --det_model_dir="./ch_ppocr_mobile_v2.0_det_infer/"
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./ch_ppocr_mobile_v2.0_det_infer/"
```
The visualized text detection results are saved to the `./inference_results` folder by default, and result file names are prefixed with 'det_res'. Example result:
![](../imgs_results/det_res_22.jpg)
![](../imgs_results/det_res_00018069.jpg)
The parameters `limit_type` and `det_limit_side_len` limit the size of the input image;
the optional values of `limit_type` are [`max`, `min`].


@ -58,7 +58,7 @@ git clone https://gitee.com/paddlepaddle/PaddleOCR
**4. Install third-party libraries**
```
cd PaddleOCR
pip3 install -r requirments.txt
pip3 install -r requirements.txt
```
Note: On Windows, it is recommended to download the shapely package from [here](https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely) to complete the installation.


@ -115,7 +115,7 @@ PaddleOCR
│ │ │ ├── text_image_aug // tia data augmentation for text recognition
│ │ │ │ ├── __init__.py
│ │ │ │ ├── augment.py // code for tia_distort, tia_stretch, and tia_perspective
│ │ │ │ ├── warp_mls.py
│ │ │ │ ├── warp_mls.py
│ │ │ ├── __init__.py
│ │ │ ├── east_process.py // data processing steps of the EAST algorithm
│ │ │ ├── make_border_map.py // generate border maps
@ -167,7 +167,7 @@ PaddleOCR
│ │ │ ├── det_east_head.py // EAST detection head
│ │ │ ├── det_sast_head.py // SAST detection head
│ │ │ ├── rec_ctc_head.py // CTC recognition head
│ │ │ ├── rec_att_head.py // attention recognition head
│ │ │ ├── rec_att_head.py // attention recognition head
│ │ ├── transforms // image transforms
│ │ │ ├── __init__.py // transform construction code
│ │ │ └── tps.py // TPS transform
@ -185,7 +185,7 @@ PaddleOCR
│ │ └── sast_postprocess.py // SAST post-processing
│ └── utils // utilities
│ ├── dict // minor-language dictionaries
│ ....
│ ....
│ ├── ic15_dict.txt // English and digit dictionary, case-sensitive
│ ├── ppocr_keys_v1.txt // Chinese dictionary, used for training Chinese models
│ ├── logging.py // logger
@ -207,10 +207,10 @@ PaddleOCR
│ ├── program.py // overall pipeline
│ ├── test_hubserving.py
│ └── train.py // start training
├── paddleocr.py
├── paddleocr.py
├── README_ch.md // Chinese documentation
├── README_en.md // English documentation
├── README.md // home page documentation
├── requirments.txt // dependencies
├── requirements.txt // dependencies
├── setup.py // whl packaging script
├── train.sh // training launch script
├── train.sh // training launch script


@ -138,12 +138,12 @@ For lightweight Chinese detection model inference, you can execute the following
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
tar xf ch_ppocr_mobile_v2.0_det_infer.tar
# predict
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/22.jpg" --det_model_dir="./inference/det_db/"
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/"
```
The visual text detection results are saved to the `./inference_results` folder by default, and the names of the result files are prefixed with 'det_res'. Examples of results are as follows:
![](../imgs_results/det_res_22.jpg)
![](../imgs_results/det_res_00018069.jpg)
You can use the parameters `limit_type` and `det_limit_side_len` to limit the size of the input image;
the optional values of `limit_type` are [`max`, `min`], and
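A simplified sketch of how such a side-length limit can be applied (an illustration of the rule only; PaddleOCR's actual preprocessing differs in details):

```python
def get_resize_shape(h, w, limit_type="max", limit_side_len=960):
    # "max": shrink so the longer side does not exceed limit_side_len;
    # "min": enlarge so the shorter side reaches at least limit_side_len.
    if limit_type == "max":
        ratio = min(1.0, limit_side_len / max(h, w))
    else:
        ratio = max(1.0, limit_side_len / min(h, w))
    # round each side to a multiple of 32 for the detection backbone
    new_h = max(32, int(round(h * ratio / 32)) * 32)
    new_w = max(32, int(round(w * ratio / 32)) * 32)
    return new_h, new_w
```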


@ -61,7 +61,7 @@ git clone https://gitee.com/paddlepaddle/PaddleOCR
**4. Install third-party libraries**
```
cd PaddleOCR
pip3 install -r requirments.txt
pip3 install -r requirements.txt
```
If you get the error `OSError: [WinError 126] The specified module could not be found` when you install shapely on Windows.


@ -116,7 +116,7 @@ PaddleOCR
│ │ │ ├── text_image_aug // Tia data augment for text recognition
│ │ │ │ ├── __init__.py
│ │ │ │ ├── augment.py // Tia_distort,tia_stretch and tia_perspective
│ │ │ │ ├── warp_mls.py
│ │ │ │ ├── warp_mls.py
│ │ │ ├── __init__.py
│ │ │ ├── east_process.py // Data processing steps of EAST algorithm
│ │ │ ├── iaa_augment.py // Data augmentation operations
@ -188,7 +188,7 @@ PaddleOCR
│ │ └── sast_postprocess.py // SAST post-processing
│ └── utils // utils
│ ├── dict // Minor language dictionary
│ ....
│ ....
│ ├── ic15_dict.txt // English number dictionary, case sensitive
│ ├── ppocr_keys_v1.txt // Chinese dictionary for training Chinese models
│ ├── logging.py // logger
@ -210,10 +210,10 @@ PaddleOCR
│ ├── program.py // Inference system
│ ├── test_hubserving.py
│ └── train.py // Start training script
├── paddleocr.py
├── paddleocr.py
├── README_ch.md // Chinese documentation
├── README_en.md // English documentation
├── README.md // Home page documentation
├── requirments.txt // Requirments
├── requirements.txt // Requirements
├── setup.py // Whl package packaging script
├── train.sh // Start training bash script
├── train.sh // Start training bash script

BIN doc/imgs/00006737.jpg Executable file (added, 126 KiB)
BIN doc/imgs/00009282.jpg Executable file (added, 42 KiB)
BIN doc/imgs/00015504.jpg Executable file (added, 89 KiB)
BIN doc/imgs/00018069.jpg Executable file (added, 67 KiB)
BIN doc/imgs/00056221.jpg Executable file (added, 100 KiB)
BIN doc/imgs/00057937.jpg Executable file (added, 150 KiB)
BIN doc/imgs/00059985.jpg Executable file (added, 54 KiB)
BIN doc/imgs/00077949.jpg Executable file (added, 122 KiB)
BIN doc/imgs/00111002.jpg Executable file (added, 98 KiB)
BIN doc/imgs/00207393.jpg Executable file (added, 49 KiB)


@ -47,11 +47,12 @@ class DBLoss(nn.Layer):
negative_ratio=ohem_ratio)
def forward(self, predicts, labels):
predict_maps = predicts['maps']
label_threshold_map, label_threshold_mask, label_shrink_map, label_shrink_mask = labels[
1:]
shrink_maps = predicts[:, 0, :, :]
threshold_maps = predicts[:, 1, :, :]
binary_maps = predicts[:, 2, :, :]
shrink_maps = predict_maps[:, 0, :, :]
threshold_maps = predict_maps[:, 1, :, :]
binary_maps = predict_maps[:, 2, :, :]
loss_shrink_maps = self.bce_loss(shrink_maps, label_shrink_map,
label_shrink_mask)
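The new contract in the diff above (the head emits a dict keyed by 'maps'; the loss slices its channels) can be sketched with a NumPy stand-in (illustrative only, not the Paddle code):

```python
import numpy as np

def split_db_maps(predicts):
    # The DB head now returns {'maps': tensor}; during training the tensor
    # stacks shrink, threshold, and binary maps on channels 0, 1, and 2.
    maps = predicts['maps']                  # shape: (N, 3, H, W)
    shrink_maps = maps[:, 0, :, :]
    threshold_maps = maps[:, 1, :, :]
    binary_maps = maps[:, 2, :, :]
    return shrink_maps, threshold_maps, binary_maps
```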


@ -120,9 +120,9 @@ class DBHead(nn.Layer):
def forward(self, x):
shrink_maps = self.binarize(x)
if not self.training:
return shrink_maps
return {'maps': shrink_maps}
threshold_maps = self.thresh(x)
binary_maps = self.step_function(shrink_maps, threshold_maps)
y = paddle.concat([shrink_maps, threshold_maps, binary_maps], axis=1)
return y
return {'maps': y}


@ -40,7 +40,8 @@ class DBPostProcess(object):
self.max_candidates = max_candidates
self.unclip_ratio = unclip_ratio
self.min_size = 3
self.dilation_kernel = None if not use_dilation else np.array([[1, 1], [1, 1]])
self.dilation_kernel = None if not use_dilation else np.array(
[[1, 1], [1, 1]])
def boxes_from_bitmap(self, pred, _bitmap, dest_width, dest_height):
'''
@ -132,7 +133,8 @@ class DBPostProcess(object):
cv2.fillPoly(mask, box.reshape(1, -1, 2).astype(np.int32), 1)
return cv2.mean(bitmap[ymin:ymax + 1, xmin:xmax + 1], mask)[0]
def __call__(self, pred, shape_list):
def __call__(self, outs_dict, shape_list):
pred = outs_dict['maps']
if isinstance(pred, paddle.Tensor):
pred = pred.numpy()
pred = pred[:, 0, :, :]


@ -102,7 +102,6 @@ def init_model(config, model, logger, optimizer=None, lr_scheduler=None):
best_model_dict = states_dict.get('best_model_dict', {})
if 'epoch' in states_dict:
best_model_dict['start_epoch'] = states_dict['epoch'] + 1
best_model_dict['start_epoch'] = best_model_dict['best_epoch'] + 1
logger.info("resume from {}".format(checkpoints))
elif pretrained_model:


@ -65,12 +65,12 @@ class TextDetector(object):
postprocess_params["unclip_ratio"] = args.det_db_unclip_ratio
postprocess_params["use_dilation"] = True
elif self.det_algorithm == "EAST":
postprocess_params['name'] = 'EASTPostProcess'
postprocess_params['name'] = 'EASTPostProcess'
postprocess_params["score_thresh"] = args.det_east_score_thresh
postprocess_params["cover_thresh"] = args.det_east_cover_thresh
postprocess_params["nms_thresh"] = args.det_east_nms_thresh
elif self.det_algorithm == "SAST":
postprocess_params['name'] = 'SASTPostProcess'
postprocess_params['name'] = 'SASTPostProcess'
postprocess_params["score_thresh"] = args.det_sast_score_thresh
postprocess_params["nms_thresh"] = args.det_sast_nms_thresh
self.det_sast_polygon = args.det_sast_polygon
@ -177,8 +177,10 @@ class TextDetector(object):
preds['f_score'] = outputs[1]
preds['f_tco'] = outputs[2]
preds['f_tvo'] = outputs[3]
elif self.det_algorithm == 'DB':
preds['maps'] = outputs[0]
else:
preds = outputs[0]
raise NotImplementedError
post_result = self.postprocess_op(preds, shape_list)
dt_boxes = post_result[0]['points']