Merge pull request #761 from LDOUBLEV/fixocr

add benchmark
2020-09-20 02:17:30 +08:00 · 2020-09-20 02:17:30 +08:00 · 5ea014f5bf
parent 1ee51e2f56 fe35728b7d
commit 5ea014f5bf
4 changed files with 98 additions and 20 deletions
--- a/doc/datasets/doc.jpg
+++ b/doc/datasets/doc.jpg
--- a/doc/doc_ch/benchmark.md
+++ b/doc/doc_ch/benchmark.md
@@ -1,29 +1,51 @@
 # Benchmark
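-This document gives inference-time benchmarks for the PaddleOCR ultra-lightweight Chinese model (8.6M) on each platform.
+This document gives accuracy metrics for the Chinese and English OCR model series, together with inference-time benchmarks on each platform.
 ## Test data
- From the public Chinese dataset [ICDAR2017-RCTW](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/datasets.md#ICDAR2017-RCTW-17), **500** images were randomly sampled.
+The test set consists of 300 images collected from real OCR application scenarios, including contracts, license plates, nameplates, train tickets, lab test reports, forms, certificates, street-view text, business cards, and digital displays. Each image contains 17 text boxes on average. The figure below shows some sample images.
-Most images in this set were captured in the wild with phone cameras; some are screenshots. The images cover a wide variety of scenes, including street views, posters, menus, indoor scenes, and screenshots of mobile apps.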
+
 <div align="center">
 <img src="../datasets/doc.jpg"  width = "800" height = "200" />
 </div>
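 ## Evaluation metrics
 Inference time on the four platforms is as follows:
-|Long side (px)|T4 (s)|V100 (s)|Intel Xeon 6148 (s)|Snapdragon 855 (s)|
+Notes:
-|-|-|-|-|-|
+- v1.0 is the DB+CRNN model without optimization strategies; v1.1 is the PP-OCR model with multiple optimization strategies and a direction classifier. slim_v1.1 is the model after pruning or quantization.
-|960|0.092|0.057|0.319|0.354|
+- The long side of the detection input image is 960.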
 |640|0.067|0.045|0.198|0.236|
 |480|0.057|0.043|0.151|0.175| 
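 Notes:
 - The timed stage is the complete pipeline from image input to result output, including image pre-processing and post-processing.
- `Intel Xeon 6148` is a server-side CPU model. Intel MKL-DNN is used in the test to accelerate CPU inference. To use it:
+- `Intel Xeon 6148` is a server-side CPU model. Intel MKL-DNN acceleration is used in the test.
    - Update to the latest PaddlePaddle version: https://www.paddlepaddle.org.cn/documentation/docs/zh/install/Tables.html#whl-dev . Select the mkl wheel package that matches the CUDA and Python versions of your environment. For example, for CUDA 10 and Python 3.7: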
    ```shell
    # Download the wheel package
    wget https://paddle-wheel.bj.bcebos.com/0.0.0-gpu-cuda10-cudnn7-mkl/paddlepaddle_gpu-0.0.0-cp37-cp37m-linux_x86_64.whl
    # Install
    pip3.7 install paddlepaddle_gpu-0.0.0-cp37-cp37m-linux_x86_64.whl
    ```
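    - Enable acceleration at inference time with the flag: `--enable_mkldnn True`
 - `Snapdragon 855` is a mobile processing platform model.
 Comparison of model size and overall recognition accuracy across inference models
 | Model name                   | Overall model<br>size\(M\) | Detection model<br>size\(M\) | Direction classifier<br>model size\(M\) | Recognition model<br>size\(M\) | Overall recognition<br>F\-score |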
 |:-:|:-:|:-:|:-:|:-:|:-:|
 | ch\_ppocr\_mobile\_v1\.1 | 8\.1        | 2\.6        | 0\.9           | 4\.6        | 0\.5193      |
 | ch\_ppocr\_server\_v1\.1 | 155\.1      | 47\.2       | 0\.9           | 107         | 0\.5414      |
 | ch\_ppocr\_mobile\_v1\.0 | 8\.6        | 4\.1        | \-             | 4\.5        | 0\.393       |
 | ch\_ppocr\_server\_v1\.0 | 203\.8      | 98\.5       | \-             | 105\.3      | 0\.4436      |
 Comparison of inference speed across models on a T4 GPU, in ms
 | Model name                   | Overall | Detection | Direction classifier | Recognition |
 |:-:|:-:|:-:|:-:|:-:|
 | ch\_ppocr\_mobile\_v1\.1 | 137 | 35 | 24    | 78  |
 | ch\_ppocr\_server\_v1\.1 | 204 | 39 | 25    | 140 |
 | ch\_ppocr\_mobile\_v1\.0 | 117 | 41 | \-    | 76  |
 | ch\_ppocr\_server\_v1\.0 | 199 | 52 | \-    | 147 |
 Comparison of inference speed across models on CPU, in ms
 | Model name                   | Overall | Detection | Direction classifier | Recognition |
 |:-:|:-:|:-:|:-:|:-:|
 | ch\_ppocr\_mobile\_v1\.1 | 421  | 164 | 51    | 206 |
 | ch\_ppocr\_mobile\_v1\.0 | 398  | 219 | \-    | 179 |
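 Comparison of model size, overall recognition accuracy, and inference speed on SD 855 between the pruned/quantized model and the original model
 | Model name                         | Overall model<br>size\(M\) | Detection model<br>size\(M\) | Direction classifier<br>model size\(M\) | Recognition model<br>size\(M\) | Overall recognition<br>F\-score | SD 855<br>\(ms\) |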
 |:-:|:-:|:-:|:-:|:-:|:-:|:-:|
 | ch\_ppocr\_mobile\_v1\.1       | 8\.1        | 2\.6        | 0\.9           | 4\.6        | 0\.5193      | 306          |
 | ch\_ppocr\_mobile\_slim\_v1\.1 | 3\.5        | 1\.4        | 0\.5           | 1\.6        | 0\.521       | 268          |
--- a/doc/doc_ch/benchmark_en.md
+++ b/doc/doc_ch/benchmark_en.md
@@ -0,0 +1,56 @@
 # BENCHMARK
 This document reports the accuracy and inference time of the model series for Chinese and English recognition.
 ## TEST DATA
 We collected 300 images from different real application scenarios to evaluate the overall OCR system, including contract samples, license plates, nameplates, train tickets, test sheets, forms, certificates, street view images, business cards, digital meters, etc. The following figure shows some images from the test set.
 <div align="center">
 <img src="../datasets/doc.jpg"  width = "800" height = "200" />
 </div>
 ## MEASUREMENT
 Explanation:
 - v1.0 denotes the DB+CRNN models without the optimization strategies. v1.1 denotes the PP-OCR models with the optimization strategies and the direction classifier. slim_v1.1 denotes the PP-OCR models with pruning or quantization.
 - The long side of the input image for the text detector is 960.
 - The measured time covers the complete pipeline from image input to result output, including image pre-processing and post-processing.
 - ```Intel Xeon 6148``` is the server-side CPU model. Intel MKL-DNN is used in the test to accelerate CPU inference.
 - ```Snapdragon 855``` is the mobile processing platform model.
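The note that timing covers the whole pipeline, from image input to result output, can be illustrated with a minimal sketch. The names `preprocess`, `infer`, `postprocess`, `run_pipeline`, and `time_end_to_end` are hypothetical stand-ins, not PaddleOCR APIs, and the warm-up run is an assumption rather than something the benchmark states. Only the measurement pattern is the point.

```python
import time
from statistics import mean


def preprocess(image):
    # Hypothetical stand-in for resizing and normalisation.
    return [pixel / 255.0 for pixel in image]


def infer(tensor):
    # Hypothetical stand-in for model inference.
    return [value * 2 for value in tensor]


def postprocess(raw):
    # Hypothetical stand-in for decoding the model output.
    return sum(raw)


def run_pipeline(image):
    # Full path from image input to result output.
    return postprocess(infer(preprocess(image)))


def time_end_to_end(images, warmup=1):
    """Mean latency in ms per image, with all three stages inside the timer."""
    for image in images[:warmup]:
        run_pipeline(image)  # warm-up runs are not timed (assumption)
    latencies = []
    for image in images:
        start = time.perf_counter()
        run_pipeline(image)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return mean(latencies)
```

Calling `time_end_to_end(images)` on a list of inputs returns one mean latency in milliseconds, which is the kind of figure the "Overall" columns report.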
 Comparison of model size and F-score:
 | Model Name                    | Model Size <br> of the <br> Whole System\(M\) | Model Size <br>of the Text <br> Detector\(M\) | Model Size <br> of the Direction <br> Classifier\(M\) | Model Size<br>of the Text <br> Recognizer \(M\) | F\-score |
 |:-:|:-:|:-:|:-:|:-:|:-:|
 | ch\_ppocr\_mobile\_v1\.1 | 8\.1        | 2\.6        | 0\.9           | 4\.6        | 0\.5193      |
 | ch\_ppocr\_server\_v1\.1 | 155\.1      | 47\.2       | 0\.9           | 107         | 0\.5414      |
 | ch\_ppocr\_mobile\_v1\.0 | 8\.6        | 4\.1        | \-             | 4\.5        | 0\.393       |
 | ch\_ppocr\_server\_v1\.0 | 203\.8      | 98\.5       | \-             | 105\.3      | 0\.4436      |
 Comparison of inference time on T4 GPU (ms):
 | Model Name                     | Overall  | Text Detector  | Direction Classifier  | Text Recognizer |
 |:-:|:-:|:-:|:-:|:-:|
 | ch\_ppocr\_mobile\_v1\.1 | 137 | 35 | 24    | 78  |
 | ch\_ppocr\_server\_v1\.1 | 204 | 39 | 25    | 140 |
 | ch\_ppocr\_mobile\_v1\.0 | 117 | 41 | \-    | 76  |
 | ch\_ppocr\_server\_v1\.0 | 199 | 52 | \-    | 147 |
 Comparison of inference time on CPU (ms):
 | Model Name                     | Overall  | Text Detector  | Direction Classifier  | Text Recognizer |
 |:-:|:-:|:-:|:-:|:-:|
 | ch\_ppocr\_mobile\_v1\.1 | 421  | 164 | 51    | 206 |
 | ch\_ppocr\_mobile\_v1\.0 | 398  | 219 | \-    | 179 |
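In the two speed tables above, the overall time appears to be the sum of the per-stage times. The sketch below checks this with the figures copied from the tables; the dictionary layout and the helper names `stage_sum` and `overall_matches_stages` are illustrative assumptions.

```python
# Each row: (overall, detector, classifier, recognizer) in ms.
# None marks the v1.0 models, which have no direction classifier.
T4_MS = {
    "ch_ppocr_mobile_v1.1": (137, 35, 24, 78),
    "ch_ppocr_server_v1.1": (204, 39, 25, 140),
    "ch_ppocr_mobile_v1.0": (117, 41, None, 76),
    "ch_ppocr_server_v1.0": (199, 52, None, 147),
}
CPU_MS = {
    "ch_ppocr_mobile_v1.1": (421, 164, 51, 206),
    "ch_ppocr_mobile_v1.0": (398, 219, None, 179),
}


def stage_sum(row):
    # Add the stage times, counting a missing classifier as 0 ms.
    _, detector, classifier, recognizer = row
    return detector + (classifier or 0) + recognizer


def overall_matches_stages(table):
    # True when every row's overall time equals the sum of its stages.
    return all(row[0] == stage_sum(row) for row in table.values())
```

Both `overall_matches_stages(T4_MS)` and `overall_matches_stages(CPU_MS)` hold for the listed figures, so the extra time of v1.1 over v1.0 in the "Overall" column can be read directly against the direction classifier column.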
 Comparison of model size, F-score, and inference time on SD 855 between the slim model and the original model:
 | Model Name                          | Model Size <br> of the <br> Whole System\(M\) | Model Size <br>of the Text <br> Detector\(M\) | Model Size <br> of the Direction <br> Classifier\(M\) | Model Size<br>of the Text <br> Recognizer \(M\) | F\-score | SD 855<br>\(ms\) |
 |:-:|:-:|:-:|:-:|:-:|:-:|:-:|
 | ch\_ppocr\_mobile\_v1\.1       | 8\.1        | 2\.6        | 0\.9           | 4\.6        | 0\.5193      | 306          |
 | ch\_ppocr\_mobile\_slim\_v1\.1 | 3\.5        | 1\.4        | 0\.5           | 1\.6        | 0\.521       | 268          |
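The trade-off in the table above is easier to read as relative changes. This short calculation uses only the figures from the two rows; the helper name `relative_reduction` is an illustrative assumption.

```python
def relative_reduction(original, slim):
    # Percentage by which the slim value is smaller than the original.
    return (original - slim) / original * 100.0


size_reduction = relative_reduction(8.1, 3.5)     # whole-system size, M
latency_reduction = relative_reduction(306, 268)  # SD 855 time, ms
f_score_delta = 0.521 - 0.5193                    # slim minus original

print(f"size: -{size_reduction:.1f}%")
print(f"latency: -{latency_reduction:.1f}%")
print(f"F-score: {f_score_delta:+.4f}")
```

With these inputs the slim model is about 56.8% smaller and about 12.4% faster on SD 855, while the reported F-score differs by 0.0017 in its favor.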
--- a/doc/doc_ch/framework.png
+++ b/doc/doc_ch/framework.png