Merge branch 'develop' of https://github.com/PaddlePaddle/PaddleOCR into develop-nblib

commit f7ae9a5fb3
@@ -21,3 +21,7 @@ output/
 *.log
 .clang-format
 .clang_format.hook
+
+build/
+dist/
+paddleocr.egg-info/
@@ -0,0 +1,8 @@
+include LICENSE.txt
+include README.md
+
+recursive-include ppocr/utils *.txt utility.py character.py check.py
+recursive-include ppocr/data/det *.py
+recursive-include ppocr/postprocess *.py
+recursive-include ppocr/postprocess/lanms *.*
+recursive-include tools/infer *.py
README.md (282 lines changed)

@@ -1,209 +1,139 @@
-[English](README_en.md) | 简体中文
+English | [简体中文](README_ch.md)
 
-## 简介
+## Introduction
-PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力使用者训练出更好的模型,并应用落地。
+PaddleOCR aims to create rich, leading, and practical OCR tools that help users train better models and apply them in practice.
 
-**近期更新**
+**Recent updates**
-- 2020.7.15 添加基于EasyEdge和Paddle-Lite的移动端DEMO,支持iOS和Android系统
+- 2020.9.22 Update the PP-OCR technical article, https://arxiv.org/abs/2009.09941
-- 2020.7.15 完善预测部署,添加基于C++预测引擎推理、服务化部署和端侧部署方案,以及超轻量级中文OCR模型预测耗时Benchmark
+- 2020.9.19 Update the ultra-lightweight compressed ppocr_mobile_slim series models; the overall model size is 3.5M (see [PP-OCR Pipline](#PP-OCR-Pipline)), suitable for mobile deployment. [Model Downloads](#Supported-Chinese-model-list)
-- 2020.7.15 整理OCR相关数据集、常用数据标注以及合成工具
+- 2020.9.17 Update the ultra-lightweight ppocr_mobile series and general ppocr_server series Chinese and English OCR models, which are comparable to commercial effects. [Model Downloads](#Supported-Chinese-model-list)
-- 2020.7.9 添加支持空格的识别模型,识别效果,预测及训练方式请参考快速开始和文本识别训练相关文档
+- 2020.8.24 Support the use of PaddleOCR through whl package installation; please refer to [PaddleOCR Package](./doc/doc_en/whl_en.md)
-- 2020.7.9 添加数据增强、学习率衰减策略,具体参考[配置文件](./doc/doc_ch/config.md)
+- 2020.8.21 Update the replay and PPT of the August 18 live lesson on Bilibili, lesson 2: an easy-to-learn and easy-to-use OCR tool spree. [Get Address](https://aistudio.baidu.com/aistudio/education/group/info/1519)
-- [more](./doc/doc_ch/update.md)
+- [more](./doc/doc_en/update_en.md)
 
-## 特性
+## Features
-- 超轻量级中文OCR模型,总模型仅8.6M
+- PPOCR series of high-quality pre-trained models, comparable to commercial effects
-- 单模型支持中英文数字组合识别、竖排文本识别、长文本识别
+    - Ultra-lightweight ppocr_mobile series models: detection (2.6M) + direction classifier (0.9M) + recognition (4.6M) = 8.1M
-- 检测模型DB(4.1M)+识别模型CRNN(4.5M)
+    - General ppocr_server series models: detection (47.2M) + direction classifier (0.9M) + recognition (107M) = 155.1M
-- 实用通用中文OCR模型
+    - Ultra-lightweight compressed ppocr_mobile_slim series models: detection (1.4M) + direction classifier (0.5M) + recognition (1.6M) = 3.5M
-- 多种预测推理部署方案,包括服务部署和端侧部署
+- Support Chinese, English, and digit recognition, vertical text recognition, and long text recognition
-- 多种文本检测训练算法,EAST、DB
+- Support multi-language recognition: Korean, Japanese, German, French
-- 多种文本识别训练算法,Rosetta、CRNN、STAR-Net、RARE
+- Support user-defined training and provide rich predictive inference deployment solutions
-- 可运行于Linux、Windows、MacOS等多种系统
+- Support PIP installation, easy to use
+- Support Linux, Windows, macOS and other systems
 
-## 快速体验
+## Visualization
 
 <div align="center">
-<img src="doc/imgs_results/11.jpg" width="800">
+<img src="doc/imgs_results/1101.jpg" width="800">
+<img src="doc/imgs_results/1103.jpg" width="800">
 </div>
 
-上图是超轻量级中文OCR模型效果展示,更多效果图请见[效果展示页面](./doc/doc_ch/visualization.md)。
+The above pictures are visualizations of the general ppocr_server model. For more effect pictures, please see [More visualizations](./doc/doc_en/visualization_en.md).
 
-- 超轻量级中文OCR在线体验地址:https://www.paddlepaddle.org.cn/hub/scene/ocr
+## Quick Experience
-- 移动端DEMO体验(基于EasyEdge和Paddle-Lite, 支持iOS和Android系统):[安装包二维码获取地址](https://ai.baidu.com/easyedge/app/openSource?from=paddlelite)
 
-Android手机也可以扫描下面二维码安装体验。
+You can also quickly experience the ultra-lightweight OCR: [Online Experience](https://www.paddlepaddle.org.cn/hub/scene/ocr)
+
+Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Android systems): [Sign in to the website to obtain the QR code for installing the App](https://ai.baidu.com/easyedge/app/openSource?from=paddlelite)
+
+Also, you can scan the QR code below to install the App (**Android support only**)
 
 <div align="center">
 <img src="./doc/ocr-android-easyedge.png" width = "200" height = "200" />
 </div>
 
-- [**中文OCR模型快速使用**](./doc/doc_ch/quickstart.md)
+- [**OCR Quick Start**](./doc/doc_en/quickstart_en.md)
+<a name="Supported-Chinese-model-list"></a>
+
+## PP-OCR 1.1 series model list (updated Sep 17)
+
+| Model introduction | Model name | Recommended scene | Detection model | Direction classifier | Recognition model |
+| ------------------------------------------------------------ | ---------------------------- | ----------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
+| Chinese and English ultra-lightweight OCR model (8.1M) | ch_ppocr_mobile_v1.1_xx | Mobile & server | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar) |
+| Chinese and English general OCR model (155.1M) | ch_ppocr_server_v1.1_xx | Server | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar) |
+| Chinese and English ultra-lightweight compressed OCR model (3.5M) | ch_ppocr_mobile_slim_v1.1_xx | Mobile | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_opt.nb) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_cls_quant_opt.nb) | [inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_opt.nb) |
+
+For more model downloads (including multiple languages), please refer to [PP-OCR v1.1 series model downloads](./doc/doc_en/models_list_en.md).
 
-## 中文OCR模型列表
+## Tutorials
+- [Installation](./doc/doc_en/installation_en.md)
+- [Quick Start](./doc/doc_en/quickstart_en.md)
+- [Code Structure](./doc/doc_en/tree_en.md)
+- Algorithm introduction
+    - [Text Detection Algorithm](./doc/doc_en/algorithm_overview_en.md)
+    - [Text Recognition Algorithm](./doc/doc_en/algorithm_overview_en.md)
+    - [PP-OCR Pipline](#PP-OCR-Pipline)
+- Model training/evaluation
+    - [Text Detection](./doc/doc_en/detection_en.md)
+    - [Text Recognition](./doc/doc_en/recognition_en.md)
+    - [Direction Classification](./doc/doc_en/angle_class_en.md)
+    - [Yml Configuration](./doc/doc_en/config_en.md)
+- Inference and Deployment
+    - [Quick inference based on pip](./doc/doc_en/whl_en.md)
+    - [Python Inference](./doc/doc_en/inference_en.md)
+    - [C++ Inference](./deploy/cpp_infer/readme_en.md)
+    - [Serving](./deploy/hubserving/readme_en.md)
+    - [Mobile](./deploy/lite/readme_en.md)
+    - [Model Quantization](./deploy/slim/quantization/README_en.md)
+    - [Model Compression](./deploy/slim/prune/README_en.md)
+    - [Benchmark](./doc/doc_en/benchmark_en.md)
+- Datasets
+    - [General OCR Datasets (Chinese/English)](./doc/doc_en/datasets_en.md)
+    - [Handwritten OCR Datasets (Chinese)](./doc/doc_en/handwritten_datasets_en.md)
+    - [Various OCR Datasets (multilingual)](./doc/doc_en/vertical_and_multilingual_datasets_en.md)
+    - [Data Annotation Tools](./doc/doc_en/data_annotation_en.md)
+    - [Data Synthesis Tools](./doc/doc_en/data_synthesis_en.md)
+- [Visualization](#Visualization)
+- [FAQ](./doc/doc_en/FAQ_en.md)
+- [Community](#Community)
+- [References](./doc/doc_en/reference_en.md)
+- [License](#LICENSE)
+- [Contribution](#CONTRIBUTION)
 
+<a name="PP-OCR-Pipline"></a>
-|模型名称|模型简介|检测模型地址|识别模型地址|支持空格的识别模型地址|
-|-|-|-|-|-|
-|chinese_db_crnn_mobile|超轻量级中文OCR模型|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar)
-|chinese_db_crnn_server|通用中文OCR模型|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[inference模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar)
 
-## 文档教程
+## PP-OCR Pipline
-- [快速安装](./doc/doc_ch/installation.md)
-- [中文OCR模型快速使用](./doc/doc_ch/quickstart.md)
-- 算法介绍
-    - [文本检测](#文本检测算法)
-    - [文本识别](#文本识别算法)
-    - [端到端OCR](#端到端OCR算法)
-- 模型训练/评估
-    - [文本检测](./doc/doc_ch/detection.md)
-    - [文本识别](./doc/doc_ch/recognition.md)
-    - [yml参数配置文件介绍](./doc/doc_ch/config.md)
-    - 中文OCR训练预测技巧
-- 预测部署
-    - [基于Python预测引擎推理](./doc/doc_ch/inference.md)
-    - [基于C++预测引擎推理](./deploy/cpp_infer/readme.md)
-    - [服务化部署](./doc/doc_ch/serving.md)
-    - [端侧部署](./deploy/lite/readme.md)
-    - 模型量化压缩
-    - [Benchmark](./doc/doc_ch/benchmark.md)
-- 数据集
-    - [通用中英文OCR数据集](./doc/doc_ch/datasets.md)
-    - [手写中文OCR数据集](./doc/doc_ch/handwritten_datasets.md)
-    - 垂类多语言OCR数据集
-    - [常用数据标注工具](./doc/doc_ch/data_annotation.md)
-    - [常用数据合成工具](./doc/doc_ch/data_synthesis.md)
-- [FAQ](#FAQ)
-- 效果展示
-    - [超轻量级中文OCR效果展示](#超轻量级中文OCR效果展示)
-    - [通用中文OCR效果展示](#通用中文OCR效果展示)
-    - [支持空格的中文OCR效果展示](#支持空格的中文OCR效果展示)
-- [技术交流群](#欢迎加入PaddleOCR技术交流群)
-- [参考文献](./doc/doc_ch/reference.md)
-- [许可证书](#许可证书)
-- [贡献代码](#贡献代码)
<a name="算法介绍"></a>
|
|
||||||
## 算法介绍
|
|
||||||
<a name="文本检测算法"></a>
|
|
||||||
### 1.文本检测算法
|
|
||||||
|
|
||||||
PaddleOCR开源的文本检测算法列表:
|
|
||||||
- [x] EAST([paper](https://arxiv.org/abs/1704.03155))
|
|
||||||
- [x] DB([paper](https://arxiv.org/abs/1911.08947))
|
|
||||||
- [ ] SAST([paper](https://arxiv.org/abs/1908.05498))(百度自研, comming soon)
|
|
||||||
|
|
||||||
在ICDAR2015文本检测公开数据集上,算法效果如下:
|
|
||||||
|
|
||||||
|模型|骨干网络|precision|recall|Hmean|下载链接|
|
|
||||||
|-|-|-|-|-|-|
|
|
||||||
|EAST|ResNet50_vd|88.18%|85.51%|86.82%|[下载链接](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)|
|
|
||||||
|EAST|MobileNetV3|81.67%|79.83%|80.74%|[下载链接](https://paddleocr.bj.bcebos.com/det_mv3_east.tar)|
|
|
||||||
|DB|ResNet50_vd|83.79%|80.65%|82.19%|[下载链接](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)|
|
|
||||||
|DB|MobileNetV3|75.92%|73.18%|74.53%|[下载链接](https://paddleocr.bj.bcebos.com/det_mv3_db.tar)|
|
|
||||||
|
|
||||||
使用[LSVT](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/datasets.md#1icdar2019-lsvt)街景数据集共3w张数据,训练中文检测模型的相关配置和预训练文件如下:
|
|
||||||
|模型|骨干网络|配置文件|预训练模型|
|
|
||||||
|-|-|-|-|
|
|
||||||
|超轻量中文模型|MobileNetV3|det_mv3_db.yml|[下载链接](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|
|
|
||||||
|通用中文OCR模型|ResNet50_vd|det_r50_vd_db.yml|[下载链接](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|
|
|
||||||
|
|
||||||
* 注: 上述DB模型的训练和评估,需设置后处理参数box_thresh=0.6,unclip_ratio=1.5,使用不同数据集、不同模型训练,可调整这两个参数进行优化
|
|
||||||
|
|
||||||
PaddleOCR文本检测算法的训练和使用请参考文档教程中[模型训练/评估中的文本检测部分](./doc/doc_ch/detection.md)。
|
|
||||||
|
|
||||||
<a name="文本识别算法"></a>
|
|
||||||
### 2.文本识别算法
|
|
||||||
|
|
||||||
PaddleOCR开源的文本识别算法列表:
|
|
||||||
- [x] CRNN([paper](https://arxiv.org/abs/1507.05717))
|
|
||||||
- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))
|
|
||||||
- [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))
|
|
||||||
- [x] RARE([paper](https://arxiv.org/abs/1603.03915v1))
|
|
||||||
- [ ] SRN([paper](https://arxiv.org/abs/2003.12294))(百度自研, comming soon)
|
|
||||||
|
|
||||||
参考[DTRB](https://arxiv.org/abs/1904.01906)文字识别训练和评估流程,使用MJSynth和SynthText两个文字识别数据集训练,在IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE数据集上进行评估,算法效果如下:
|
|
||||||
|
|
||||||
|模型|骨干网络|Avg Accuracy|模型存储命名|下载链接|
|
|
||||||
|-|-|-|-|-|
|
|
||||||
|Rosetta|Resnet34_vd|80.24%|rec_r34_vd_none_none_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_none_ctc.tar)|
|
|
||||||
|Rosetta|MobileNetV3|78.16%|rec_mv3_none_none_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_mv3_none_none_ctc.tar)|
|
|
||||||
|CRNN|Resnet34_vd|82.20%|rec_r34_vd_none_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_bilstm_ctc.tar)|
|
|
||||||
|CRNN|MobileNetV3|79.37%|rec_mv3_none_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar)|
|
|
||||||
|STAR-Net|Resnet34_vd|83.93%|rec_r34_vd_tps_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_ctc.tar)|
|
|
||||||
|STAR-Net|MobileNetV3|81.56%|rec_mv3_tps_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_ctc.tar)|
|
|
||||||
|RARE|Resnet34_vd|84.90%|rec_r34_vd_tps_bilstm_attn|[下载链接](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_attn.tar)|
|
|
||||||
|RARE|MobileNetV3|83.32%|rec_mv3_tps_bilstm_attn|[下载链接](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_attn.tar)|
|
|
||||||
|
|
||||||
使用[LSVT](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/datasets.md#1icdar2019-lsvt)街景数据集根据真值将图crop出来30w数据,进行位置校准。此外基于LSVT语料生成500w合成数据训练中文模型,相关配置和预训练文件如下:
|
|
||||||
|
|
||||||
|模型|骨干网络|配置文件|预训练模型|
|
|
||||||
|-|-|-|-|
|
|
||||||
|超轻量中文模型|MobileNetV3|rec_chinese_lite_train.yml|[下载链接](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|
|
|
||||||
|通用中文OCR模型|Resnet34_vd|rec_chinese_common_train.yml|[下载链接](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|
|
|
||||||
|
|
||||||
PaddleOCR文本识别算法的训练和使用请参考文档教程中[模型训练/评估中的文本识别部分](./doc/doc_ch/recognition.md)。
|
|
||||||
|
|
||||||
<a name="端到端OCR算法"></a>
|
|
||||||
### 3.端到端OCR算法
|
|
||||||
- [ ] [End2End-PSL](https://arxiv.org/abs/1909.07808)(百度自研, comming soon)
|
|
||||||
|
|
||||||
## 效果展示
|
|
||||||
|
|
||||||
<a name="超轻量级中文OCR效果展示"></a>
|
|
||||||
### 1.超轻量级中文OCR效果展示 [more](./doc/doc_ch/visualization.md)
|
|
||||||
|
|
||||||
<div align="center">
|
<div align="center">
|
||||||
<img src="doc/imgs_results/1.jpg" width="800">
|
<img src="./doc/ppocr_framework.png" width="800">
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<a name="通用中文OCR效果展示"></a>
|
PP-OCR is a practical ultra-lightweight OCR system. It is mainly composed of three parts: DB text detection, detection frame correction and CRNN text recognition. The system adopts 19 effective strategies from 8 aspects including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-training model use, and automatic model tailoring and quantization to optimize and slim down the models of each module. The final results are an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English digital OCR model. For more details, please refer to the PP-OCR technical article (https://arxiv.org/abs/2009.09941).
|
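The three-stage composition described above (detection → detection frame correction / direction classification → recognition) can be sketched as plain Python. This is a hedged illustration only: the stage functions below are hypothetical stand-ins, not PaddleOCR's actual API or model outputs.

```python
def detect_boxes(image):
    # Stand-in for DB text detection: return (box, crop) pairs for each text region.
    return [((0, 0, 10, 2), "crop_a"), ((0, 5, 10, 7), "crop_b")]

def classify_direction(crop):
    # Stand-in for the direction classifier: flip a crop 180° if it is upside down.
    return crop  # assume already upright in this toy example

def recognize_text(crop):
    # Stand-in for CRNN recognition: map a line crop to (text, confidence).
    return {"crop_a": ("Hello", 0.98), "crop_b": ("World", 0.95)}[crop]

def ppocr_pipeline(image):
    # Chain the three stages, yielding (box, text, score) per detected line.
    results = []
    for box, crop in detect_boxes(image):
        upright = classify_direction(crop)
        text, score = recognize_text(upright)
        results.append((box, text, score))
    return results

print(ppocr_pipeline("demo.jpg"))
```

The per-stage model sizes quoted earlier (e.g. 2.6M + 0.9M + 4.6M = 8.1M for ppocr_mobile) sum over exactly these three stages.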
-### 2.通用中文OCR效果展示 [more](./doc/doc_ch/visualization.md)
+## Visualization [more](./doc/doc_en/visualization_en.md)
 
 <div align="center">
-<img src="doc/imgs_results/chinese_db_crnn_server/11.jpg" width="800">
+<img src="./doc/imgs_results/1102.jpg" width="800">
+<img src="./doc/imgs_results/1104.jpg" width="800">
+<img src="./doc/imgs_results/1106.jpg" width="800">
+<img src="./doc/imgs_results/1105.jpg" width="800">
+<img src="./doc/imgs_results/1110.jpg" width="800">
+<img src="./doc/imgs_results/1112.jpg" width="800">
 </div>
 
-<a name="支持空格的中文OCR效果展示"></a>
+<a name="Community"></a>
-### 3.支持空格的中文OCR效果展示 [more](./doc/doc_ch/visualization.md)
+## Community
+Scan the QR code below with WeChat and complete the questionnaire to get access to the official technical exchange group.
 
 <div align="center">
-<img src="doc/imgs_results/chinese_db_crnn_server/en_paper.jpg" width="800">
+<img src="./doc/joinus.PNG" width = "200" height = "200" />
 </div>
 
-<a name="FAQ"></a>
+<a name="LICENSE"></a>
-## FAQ
+## License
-1. **转换attention识别模型时报错:KeyError: 'predict'**
+This project is released under the <a href="https://github.com/PaddlePaddle/PaddleOCR/blob/master/LICENSE">Apache 2.0 license</a>.
-问题已解决,请更新到最新代码。
-
-2. **关于推理速度**
+<a name="CONTRIBUTION"></a>
-图片中的文字较多时,预测时间会增加,可以使用--rec_batch_num设置更小预测batch num,默认值为30,可以改为10或其他数值。
+## Contribution
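The removed FAQ entry above notes that prediction time grows with the number of text lines in an image, and that `--rec_batch_num` (default 30) can be lowered, e.g. to 10, so fewer line crops are recognized per forward pass. A minimal sketch of that batching idea in plain Python (an illustration of the concept, not PaddleOCR's implementation):

```python
def batched(items, batch_size):
    """Yield successive fixed-size batches; the last batch may be smaller."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Seven detected line crops recognized with rec_batch_num = 3:
crops = [f"line_{i}" for i in range(7)]
print([len(batch) for batch in batched(crops, 3)])
# → [3, 3, 1]
```

A smaller batch size lowers peak memory per recognition pass, at the cost of more passes.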
+We welcome all contributions to PaddleOCR and greatly appreciate your feedback.
 
-3. **服务部署与移动端部署**
+- Many thanks to [Khanh Tran](https://github.com/xxxpsyduck) and [Karl Horky](https://github.com/karlhorky) for contributing and revising the English documentation.
-预计6月中下旬会先后发布基于Serving的服务部署方案和基于Paddle Lite的移动端部署方案,欢迎持续关注。
+- Many thanks to [zhangxin](https://github.com/ZhangXinNan) for contributing a new visualization function, adding .gitignore, and removing the need to set PYTHONPATH manually.
+- Many thanks to [lyl120117](https://github.com/lyl120117) for contributing the code for printing the network structure.
-4. **自研算法发布时间**
+- Thanks to [xiangyubo](https://github.com/xiangyubo) for contributing the handwritten Chinese OCR dataset.
-自研算法SAST、SRN、End2End-PSL都将在7-8月陆续发布,敬请期待。
+- Thanks to [authorfu](https://github.com/authorfu) for contributing the Android demo and [xiadeye](https://github.com/xiadeye) for contributing the iOS demo.
+- Thanks to [BeyondYourself](https://github.com/BeyondYourself) for contributing many great suggestions and simplifying part of the code style.
-[more](./doc/doc_ch/FAQ.md)
+- Thanks to [tangmq](https://gitee.com/tangmq) for contributing Dockerized deployment services to PaddleOCR and supporting the rapid release of callable RESTful API services.
 
-<a name="欢迎加入PaddleOCR技术交流群"></a>
-## 欢迎加入PaddleOCR技术交流群
-请扫描下面二维码,完成问卷填写,获取加群二维码和OCR方向的炼丹秘籍
-
-<div align="center">
-<img src="./doc/joinus.jpg" width = "200" height = "200" />
-</div>
-
-<a name="许可证书"></a>
-## 许可证书
-本项目的发布受<a href="https://github.com/PaddlePaddle/PaddleOCR/blob/master/LICENSE">Apache 2.0 license</a>许可认证。
-
-<a name="贡献代码"></a>
-## 贡献代码
-我们非常欢迎你为PaddleOCR贡献代码,也十分感谢你的反馈。
-
-- 非常感谢 [Khanh Tran](https://github.com/xxxpsyduck) 贡献了英文文档。
-- 非常感谢 [zhangxin](https://github.com/ZhangXinNan)([Blog](https://blog.csdn.net/sdlypyzq)) 贡献新的可视化方式、添加.gitignore、处理手动设置PYTHONPATH环境变量的问题
-- 非常感谢 [lyl120117](https://github.com/lyl120117) 贡献打印网络结构的代码
-- 非常感谢 [xiangyubo](https://github.com/xiangyubo) 贡献手写中文OCR数据集
README_ch.md (new file)

@@ -0,0 +1,141 @@
+[English](README.md) | 简体中文
+
+## 简介
+PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力使用者训练出更好的模型,并应用落地。
+
+**近期更新**
+- 2020.9.22 更新PP-OCR技术文章,https://arxiv.org/abs/2009.09941
+- 2020.9.19 更新超轻量压缩ppocr_mobile_slim系列模型,整体模型3.5M(详见[PP-OCR Pipline](#PP-OCR)),适合在移动端部署使用。[模型下载](#模型下载)
+- 2020.9.17 更新超轻量ppocr_mobile系列和通用ppocr_server系列中英文ocr模型,媲美商业效果。[模型下载](#模型下载)
+- 2020.8.26 更新OCR相关的84个常见问题及解答,具体参考[FAQ](./doc/doc_ch/FAQ.md)
+- 2020.8.24 支持通过whl包安装使用PaddleOCR,具体参考[Paddleocr Package使用说明](./doc/doc_ch/whl.md)
+- 2020.8.21 更新8月18日B站直播课回放和PPT,课节2,易学易用的OCR工具大礼包,[获取地址](https://aistudio.baidu.com/aistudio/education/group/info/1519)
+- [More](./doc/doc_ch/update.md)
+
+## 特性
+
+- PPOCR系列高质量预训练模型,准确的识别效果
+    - 超轻量ppocr_mobile移动端系列:检测(2.6M)+方向分类器(0.9M)+ 识别(4.6M)= 8.1M
+    - 通用ppocr_server系列:检测(47.2M)+方向分类器(0.9M)+ 识别(107M)= 155.1M
+    - 超轻量压缩ppocr_mobile_slim系列:检测(1.4M)+方向分类器(0.5M)+ 识别(1.6M)= 3.5M
+- 支持中英文数字组合识别、竖排文本识别、长文本识别
+- 支持多语言识别:韩语、日语、德语、法语
+- 支持用户自定义训练,提供丰富的预测推理部署方案
+- 支持PIP快速安装使用
+- 可运行于Linux、Windows、MacOS等多种系统
+
+## 效果展示
+
+<div align="center">
+<img src="doc/imgs_results/1101.jpg" width="800">
+<img src="doc/imgs_results/1103.jpg" width="800">
+</div>
+
+上图是通用ppocr_server模型效果展示,更多效果图请见[效果展示页面](./doc/doc_ch/visualization.md)。
+
+## 快速体验
+- PC端:超轻量级中文OCR在线体验地址:https://www.paddlepaddle.org.cn/hub/scene/ocr
+
+- 移动端:[安装包DEMO下载地址](https://ai.baidu.com/easyedge/app/openSource?from=paddlelite)(基于EasyEdge和Paddle-Lite, 支持iOS和Android系统),Android手机也可以直接扫描下面二维码安装体验。
+
+<div align="center">
+<img src="./doc/ocr-android-easyedge.png" width = "200" height = "200" />
+</div>
+
+- 代码体验:从[快速安装](./doc/doc_ch/installation.md) 开始
+
+<a name="模型下载"></a>
+## PP-OCR 1.1系列模型列表(9月17日更新)
+
+| 模型简介 | 模型名称 |推荐场景 | 检测模型 | 方向分类器 | 识别模型 |
+| ------------ | --------------- | ----------------|---- | ---------- | -------- |
+| 中英文超轻量OCR模型(8.1M) | ch_ppocr_mobile_v1.1_xx |移动端&服务器端|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar)|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar) |
+| 中英文通用OCR模型(155.1M) |ch_ppocr_server_v1.1_xx|服务器端 |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar) |
+| 中英文超轻量压缩OCR模型(3.5M) | ch_ppocr_mobile_slim_v1.1_xx| 移动端 |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_opt.nb) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_cls_quant_opt.nb)| [推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_opt.nb)|
+
+更多模型下载(包括多语言),可以参考[PP-OCR v1.1 系列模型下载](./doc/doc_ch/models_list.md)
+
+## 文档教程
+- [快速安装](./doc/doc_ch/installation.md)
+- [中文OCR模型快速使用](./doc/doc_ch/quickstart.md)
+- [代码组织结构](./doc/doc_ch/tree.md)
+- 算法介绍
+    - [文本检测](./doc/doc_ch/algorithm_overview.md)
+    - [文本识别](./doc/doc_ch/algorithm_overview.md)
+    - [PP-OCR Pipline](#PP-OCR)
+- 模型训练/评估
+    - [文本检测](./doc/doc_ch/detection.md)
+    - [文本识别](./doc/doc_ch/recognition.md)
+    - [方向分类器](./doc/doc_ch/angle_class.md)
+    - [yml参数配置文件介绍](./doc/doc_ch/config.md)
+- 预测部署
+    - [基于pip安装whl包快速推理](./doc/doc_ch/whl.md)
+    - [基于Python脚本预测引擎推理](./doc/doc_ch/inference.md)
+    - [基于C++预测引擎推理](./deploy/cpp_infer/readme.md)
+    - [服务化部署](./deploy/hubserving/readme.md)
+    - [端侧部署](./deploy/lite/readme.md)
+    - [模型量化](./deploy/slim/quantization/README.md)
+    - [模型裁剪](./deploy/slim/prune/README.md)
+    - [Benchmark](./doc/doc_ch/benchmark.md)
+- 数据集
+    - [通用中英文OCR数据集](./doc/doc_ch/datasets.md)
+    - [手写中文OCR数据集](./doc/doc_ch/handwritten_datasets.md)
+    - [垂类多语言OCR数据集](./doc/doc_ch/vertical_and_multilingual_datasets.md)
+    - [常用数据标注工具](./doc/doc_ch/data_annotation.md)
+    - [常用数据合成工具](./doc/doc_ch/data_synthesis.md)
+- [效果展示](#效果展示)
+- FAQ
+    - [【精选】OCR精选10个问题](./doc/doc_ch/FAQ.md)
+    - [【理论篇】OCR通用21个问题](./doc/doc_ch/FAQ.md)
+    - [【实战篇】PaddleOCR实战53个问题](./doc/doc_ch/FAQ.md)
+- [技术交流群](#欢迎加入PaddleOCR技术交流群)
+- [参考文献](./doc/doc_ch/reference.md)
+- [许可证书](#许可证书)
+- [贡献代码](#贡献代码)
+
+<a name="PP-OCR"></a>
+## PP-OCR Pipline
+<div align="center">
+<img src="./doc/ppocr_framework.png" width="800">
+</div>
+
+PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框矫正和CRNN文本识别三部分组成。该系统从骨干网络选择和调整、预测头部的设计、数据增强、学习率变换策略、正则化参数选择、预训练模型使用以及模型自动裁剪量化8个方面,采用19个有效策略,对各个模块的模型进行效果调优和瘦身,最终得到整体大小为3.5M的超轻量中英文OCR和2.8M的英文数字OCR。更多细节请参考PP-OCR技术方案 https://arxiv.org/abs/2009.09941 。
+
+<a name="效果展示"></a>
+## 效果展示 [more](./doc/doc_ch/visualization.md)
+
+<div align="center">
+<img src="./doc/imgs_results/1102.jpg" width="800">
+<img src="./doc/imgs_results/1104.jpg" width="800">
+<img src="./doc/imgs_results/1106.jpg" width="800">
+<img src="./doc/imgs_results/1105.jpg" width="800">
+<img src="./doc/imgs_results/1110.jpg" width="800">
+<img src="./doc/imgs_results/1112.jpg" width="800">
+</div>
+
+<a name="欢迎加入PaddleOCR技术交流群"></a>
+## 欢迎加入PaddleOCR技术交流群
+请扫描下面二维码,完成问卷填写,获取加群二维码和OCR方向的炼丹秘籍
+
+<div align="center">
+<img src="./doc/joinus.PNG" width = "200" height = "200" />
+</div>
+
+<a name="许可证书"></a>
+## 许可证书
+本项目的发布受<a href="https://github.com/PaddlePaddle/PaddleOCR/blob/master/LICENSE">Apache 2.0 license</a>许可认证。
+
+<a name="贡献代码"></a>
+## 贡献代码
+我们非常欢迎你为PaddleOCR贡献代码,也十分感谢你的反馈。
+
+- 非常感谢 [Khanh Tran](https://github.com/xxxpsyduck) 和 [Karl Horky](https://github.com/karlhorky) 贡献修改英文文档
+- 非常感谢 [zhangxin](https://github.com/ZhangXinNan)([Blog](https://blog.csdn.net/sdlypyzq)) 贡献新的可视化方式、添加.gitignore、处理手动设置PYTHONPATH环境变量的问题
+- 非常感谢 [lyl120117](https://github.com/lyl120117) 贡献打印网络结构的代码
+- 非常感谢 [xiangyubo](https://github.com/xiangyubo) 贡献手写中文OCR数据集
+- 非常感谢 [authorfu](https://github.com/authorfu) 贡献Android和[xiadeye](https://github.com/xiadeye) 贡献iOS的demo代码
+- 非常感谢 [BeyondYourself](https://github.com/BeyondYourself) 给PaddleOCR提了很多非常棒的建议,并简化了PaddleOCR的部分代码风格。
+- 非常感谢 [tangmq](https://gitee.com/tangmq) 给PaddleOCR增加Docker化部署服务,支持快速发布可调用的Restful API服务。
README_en.md (302 lines removed)

@@ -1,302 +0,0 @@
||||||
English | [简体中文](README.md)

## INTRODUCTION

PaddleOCR aims to create rich, leading, and practical OCR tools that help users train better models and apply them into practice.

**Recent updates**

- 2020.7.9 Add recognition model to support space, [recognition result](#Space-Chinese-OCR-results). For more information: [Recognition](./doc/doc_ch/recognition.md) and [quickstart](./doc/doc_ch/quickstart.md)
- 2020.7.9 Add data augmentation and learning rate decay strategies, please read [config](./doc/doc_en/config_en.md)
- 2020.6.8 Add [dataset](./doc/doc_en/datasets_en.md) and keep updating
- 2020.6.5 Support exporting `attention` model to `inference_model`
- 2020.6.5 Support separate prediction and recognition, output result score
- [more](./doc/doc_en/update_en.md)

## FEATURES

- Lightweight Chinese OCR model, total model size is only 8.6M
- Single model supports mixed Chinese, English and digit recognition, vertical text recognition, and long text recognition
- Detection model DB (4.1M) + recognition model CRNN (4.5M)
- Various text detection algorithms: EAST, DB
- Various text recognition algorithms: Rosetta, CRNN, STAR-Net, RARE

<a name="Supported-Chinese-model-list"></a>
### Supported Chinese model list:

|Model Name|Description|Detection Model link|Recognition Model link|Space Recognition Model link|
|-|-|-|-|-|
|chinese_db_crnn_mobile|lightweight Chinese OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar)|
|chinese_db_crnn_server|General Chinese OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar)|

For testing our Chinese OCR online: https://www.paddlepaddle.org.cn/hub/scene/ocr

**You can also quickly experience the lightweight Chinese OCR and General Chinese OCR models as follows:**

## LIGHTWEIGHT CHINESE OCR AND GENERAL CHINESE OCR INFERENCE

![Demo](doc/imgs_results/11.jpg)

The picture above shows the result of our lightweight Chinese OCR model. For more testing results, please see [lightweight Chinese OCR results](#lightweight-Chinese-OCR-results), [General Chinese OCR results](#General-Chinese-OCR-results) and [Support for space Recognition Model](#Space-Chinese-OCR-results) at the end of this article.

#### 1. ENVIRONMENT CONFIGURATION

Please see [Quick installation](./doc/doc_en/installation_en.md)

#### 2. DOWNLOAD INFERENCE MODELS

#### (1) Download lightweight Chinese OCR models

*If wget is not installed on Windows, you can copy the link into a browser to download the model. After the model is downloaded, unzip it and place it in the corresponding directory.*

Copy the detection and recognition `inference model` addresses from the [Chinese model list](#Supported-Chinese-model-list), then download and unpack them:

```
mkdir inference && cd inference
# Download the detection part of the Chinese OCR and decompress it
wget {url/of/detection/inference_model} && tar xf {name/of/detection/inference_model/package}
# Download the recognition part of the Chinese OCR and decompress it
wget {url/of/recognition/inference_model} && tar xf {name/of/recognition/inference_model/package}
cd ..
```

Take the lightweight Chinese OCR model as an example:

```
mkdir inference && cd inference
# Download the detection part of the lightweight Chinese OCR and decompress it
wget https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar && tar xf ch_det_mv3_db_infer.tar
# Download the recognition part of the lightweight Chinese OCR and decompress it
wget https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar && tar xf ch_rec_mv3_crnn_infer.tar
# Download the space-recognition part of the lightweight Chinese OCR and decompress it
wget https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar && tar xf ch_rec_mv3_crnn_enhance_infer.tar
cd ..
```

After decompression, the file structure should be as follows:

```
|-inference
    |-ch_rec_mv3_crnn
        |- model
        |- params
    |-ch_det_mv3_db
        |- model
        |- params
    ...
```

#### 3. SINGLE IMAGE AND BATCH PREDICTION

The following commands run text detection and recognition inference in tandem. When performing prediction, specify the path of a single image or an image folder through the parameter `image_dir`; the parameter `det_model_dir` specifies the path to the detection model, and the parameter `rec_model_dir` specifies the path to the recognition model. The visualized prediction results are saved to the `./inference_results` folder by default.

```bash
# Prediction on a single image by specifying image path to image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_mv3_db/" --rec_model_dir="./inference/ch_rec_mv3_crnn/"

# Prediction on a batch of images by specifying image folder path to image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_det_mv3_db/" --rec_model_dir="./inference/ch_rec_mv3_crnn/"

# If you want to use CPU for prediction, set the use_gpu parameter to False
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_mv3_db/" --rec_model_dir="./inference/ch_rec_mv3_crnn/" --use_gpu=False
```

To run inference of the General Chinese OCR model, follow the steps above to download the corresponding models and update the relevant parameters. Examples are as follows:

```
# Prediction on a single image by specifying image path to image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_r50_vd_db/" --rec_model_dir="./inference/ch_rec_r34_vd_crnn/"
```

To run inference of the General Chinese OCR model with space recognition, follow the steps above to download the corresponding models and update the relevant parameters. Examples are as follows:

```
# Prediction on a single image by specifying image path to image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs_en/img_12.jpg" --det_model_dir="./inference/ch_det_r50_vd_db/" --rec_model_dir="./inference/ch_rec_r34_vd_crnn_enhance/"
```

For more text detection and recognition models, please refer to the document [Inference](./doc/doc_en/inference_en.md)

## DOCUMENTATION
- [Quick installation](./doc/doc_en/installation_en.md)
- [Text detection model training/evaluation/prediction](./doc/doc_en/detection_en.md)
- [Text recognition model training/evaluation/prediction](./doc/doc_en/recognition_en.md)
- [Inference](./doc/doc_en/inference_en.md)
- [Introduction of yml file](./doc/doc_en/config_en.md)
- [Dataset](./doc/doc_en/datasets_en.md)
- [FAQ](#FAQ)

## TEXT DETECTION ALGORITHM

PaddleOCR open source text detection algorithms list:
- [x] EAST([paper](https://arxiv.org/abs/1704.03155))
- [x] DB([paper](https://arxiv.org/abs/1911.08947))
- [ ] SAST([paper](https://arxiv.org/abs/1908.05498)) (Baidu self-developed, coming soon)

On the ICDAR2015 dataset, the text detection results are as follows:

|Model|Backbone|Precision|Recall|Hmean|Download link|
|-|-|-|-|-|-|
|EAST|ResNet50_vd|88.18%|85.51%|86.82%|[Download link](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)|
|EAST|MobileNetV3|81.67%|79.83%|80.74%|[Download link](https://paddleocr.bj.bcebos.com/det_mv3_east.tar)|
|DB|ResNet50_vd|83.79%|80.65%|82.19%|[Download link](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)|
|DB|MobileNetV3|75.92%|73.18%|74.53%|[Download link](https://paddleocr.bj.bcebos.com/det_mv3_db.tar)|

For the [LSVT](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/datasets_en.md#1-icdar2019-lsvt) street view dataset with 30,000 training images in total, the related configurations and pre-trained models for the Chinese detection task are as follows:

|Model|Backbone|Configuration file|Pre-trained model|
|-|-|-|-|
|lightweight Chinese model|MobileNetV3|det_mv3_db.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|
|General Chinese OCR model|ResNet50_vd|det_r50_vd_db.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|

* Note: For the training and evaluation of the above DB models, the post-processing parameters box_thresh=0.6 and unclip_ratio=1.5 need to be set. If you train on different datasets or with different models, these two parameters can be adjusted for better results.
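The role of unclip_ratio in DB post-processing can be pictured with the offset formula from the DB paper: each shrunk text region is expanded by a distance D = A × r / L, where A is the region's area, L its perimeter, and r the unclip ratio. The sketch below is a simplified rectangle-only illustration under that assumption; PaddleOCR's actual implementation unclips arbitrary polygons.

```python
# Sketch of how unclip_ratio sets the expansion offset in DB
# post-processing (simplified to axis-aligned rectangles; the real code
# offsets polygons). Per the DB paper: D = A * r / L.
def unclip_offset(width: float, height: float, unclip_ratio: float = 1.5) -> float:
    area = width * height            # A
    perimeter = 2 * (width + height) # L
    return area * unclip_ratio / perimeter

# A 100x20 detected box with unclip_ratio=1.5 grows by 12.5 px on each side.
offset = unclip_offset(100, 20, 1.5)
```

A larger unclip_ratio therefore yields looser boxes, which is why the note above pins it to 1.5 for these models.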

For the training guide and use of PaddleOCR text detection algorithms, please refer to the document [Text detection model training/evaluation/prediction](./doc/doc_en/detection_en.md)

## TEXT RECOGNITION ALGORITHM

PaddleOCR open-source text recognition algorithms list:
- [x] CRNN([paper](https://arxiv.org/abs/1507.05717))
- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))
- [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))
- [x] RARE([paper](https://arxiv.org/abs/1603.03915v1))
- [ ] SRN([paper](https://arxiv.org/abs/2003.12294)) (Baidu self-developed, coming soon)

Following [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation results of the above text recognition algorithms (trained on MJSynth and SynthText, evaluated on IIIT, SVT, IC03, IC13, IC15, SVTP and CUTE) are as follows:

|Model|Backbone|Avg Accuracy|Module combination|Download link|
|-|-|-|-|-|
|Rosetta|Resnet34_vd|80.24%|rec_r34_vd_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_none_ctc.tar)|
|Rosetta|MobileNetV3|78.16%|rec_mv3_none_none_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_none_none_ctc.tar)|
|CRNN|Resnet34_vd|82.20%|rec_r34_vd_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_bilstm_ctc.tar)|
|CRNN|MobileNetV3|79.37%|rec_mv3_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar)|
|STAR-Net|Resnet34_vd|83.93%|rec_r34_vd_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_ctc.tar)|
|STAR-Net|MobileNetV3|81.56%|rec_mv3_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_ctc.tar)|
|RARE|Resnet34_vd|84.90%|rec_r34_vd_tps_bilstm_attn|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_attn.tar)|
|RARE|MobileNetV3|83.32%|rec_mv3_tps_bilstm_attn|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_attn.tar)|

We use the [LSVT](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/datasets_en.md#1-icdar2019-lsvt) dataset and crop out 300,000 training images from the original photos using the position ground truth, with some necessary calibration. In addition, based on the LSVT corpus, 5 million synthetic samples are generated to train the Chinese models. The related configurations and pre-trained models are as follows:

|Model|Backbone|Configuration file|Pre-trained model|Space recognition model|
|-|-|-|-|-|
|lightweight Chinese model|MobileNetV3|rec_chinese_lite_train.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) & [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar)|
|General Chinese OCR model|Resnet34_vd|rec_chinese_common_train.yml|[Download link](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) & [pre-trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar)|

For the training guide and use of PaddleOCR text recognition algorithms, please refer to the document [Text recognition model training/evaluation/prediction](./doc/doc_en/recognition_en.md)

## END-TO-END OCR ALGORITHM
- [ ] [End2End-PSL](https://arxiv.org/abs/1909.07808) (Baidu self-developed, coming soon)

<a name="lightweight-Chinese-OCR-results"></a>
## LIGHTWEIGHT CHINESE OCR RESULTS

![](doc/imgs_results/1.jpg)
![](doc/imgs_results/7.jpg)
![](doc/imgs_results/12.jpg)
![](doc/imgs_results/4.jpg)
![](doc/imgs_results/6.jpg)
![](doc/imgs_results/9.jpg)
![](doc/imgs_results/16.png)
![](doc/imgs_results/22.jpg)

<a name="General-Chinese-OCR-results"></a>
## GENERAL CHINESE OCR RESULTS

![](doc/imgs_results/chinese_db_crnn_server/11.jpg)
![](doc/imgs_results/chinese_db_crnn_server/2.jpg)
![](doc/imgs_results/chinese_db_crnn_server/8.jpg)

<a name="Space-Chinese-OCR-results"></a>
## SPACE CHINESE OCR RESULTS

### LIGHTWEIGHT CHINESE OCR RESULTS

![](doc/imgs_results/img_11.jpg)

### GENERAL CHINESE OCR RESULTS

![](doc/imgs_results/chinese_db_crnn_server/en_paper.jpg)

<a name="FAQ"></a>
## FAQ
1. Error when using an attention-based recognition model: KeyError: 'predict'

   The inference of the recognition model based on attention loss is still being debugged. For Chinese text recognition, it is recommended to choose the recognition model based on CTC loss first. In practice, we have also found that recognition models based on attention loss are less effective than those based on CTC loss.

2. About inference speed

   When an image contains a lot of text, the prediction time increases. You can use `--rec_batch_num` to set a smaller prediction batch size. The default value is 30, which can be changed to 10 or another value.
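What `rec_batch_num` controls can be sketched as plain chunking: the detected text crops are fed to the recognizer in groups of at most that size, so a smaller value lowers peak memory at some cost in throughput. A minimal stdlib sketch (the helper name is hypothetical, not PaddleOCR's code):

```python
# Hypothetical sketch of rec_batch_num: recognize detected text crops
# in chunks of at most rec_batch_num images per forward pass.
def batch_crops(crops, rec_batch_num=30):
    for i in range(0, len(crops), rec_batch_num):
        yield crops[i:i + rec_batch_num]

# 65 detected crops with the default of 30 -> batches of 30, 30 and 5.
batches = list(batch_crops(list(range(65)), rec_batch_num=30))
```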

3. Service deployment and mobile deployment

   The service deployment based on Serving and the mobile deployment based on Paddle Lite are expected to be released successively in mid-to-late June. Stay tuned for more updates.

4. Release time of self-developed algorithms

   Baidu self-developed algorithms such as SAST, SRN and End2End-PSL will be released in June or July. Please be patient.

[more](./doc/doc_en/FAQ_en.md)

## WELCOME TO THE PaddleOCR TECHNICAL EXCHANGE GROUP
Add "paddlehelp" on WeChat with the note "OCR", and our assistant will add you to the group.

<img src="./doc/paddlehelp.jpg" width = "200" height = "200" />

## REFERENCES
```
1. EAST:
@inproceedings{zhou2017east,
  title={EAST: an efficient and accurate scene text detector},
  author={Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun},
  booktitle={Proceedings of the IEEE conference on Computer Vision and Pattern Recognition},
  pages={5551--5560},
  year={2017}
}

2. DB:
@article{liao2019real,
  title={Real-time Scene Text Detection with Differentiable Binarization},
  author={Liao, Minghui and Wan, Zhaoyi and Yao, Cong and Chen, Kai and Bai, Xiang},
  journal={arXiv preprint arXiv:1911.08947},
  year={2019}
}

3. DTRB:
@inproceedings{baek2019wrong,
  title={What is wrong with scene text recognition model comparisons? dataset and model analysis},
  author={Baek, Jeonghun and Kim, Geewook and Lee, Junyeop and Park, Sungrae and Han, Dongyoon and Yun, Sangdoo and Oh, Seong Joon and Lee, Hwalsuk},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={4715--4723},
  year={2019}
}

4. SAST:
@inproceedings{wang2019single,
  title={A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning},
  author={Wang, Pengfei and Zhang, Chengquan and Qi, Fei and Huang, Zuming and En, Mengyi and Han, Junyu and Liu, Jingtuo and Ding, Errui and Shi, Guangming},
  booktitle={Proceedings of the 27th ACM International Conference on Multimedia},
  pages={1277--1285},
  year={2019}
}

5. SRN:
@article{yu2020towards,
  title={Towards Accurate Scene Text Recognition with Semantic Reasoning Networks},
  author={Yu, Deli and Li, Xuan and Zhang, Chengquan and Han, Junyu and Liu, Jingtuo and Ding, Errui},
  journal={arXiv preprint arXiv:2003.12294},
  year={2020}
}

6. end2end-psl:
@inproceedings{sun2019chinese,
  title={Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning},
  author={Sun, Yipeng and Liu, Jiaming and Liu, Wei and Han, Junyu and Ding, Errui and Liu, Jingtuo},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={9086--9095},
  year={2019}
}
```

## LICENSE
This project is released under the <a href="https://github.com/PaddlePaddle/PaddleOCR/blob/master/LICENSE">Apache 2.0 license</a>

## CONTRIBUTION
We welcome all contributions to PaddleOCR and appreciate your feedback very much.

- Many thanks to [Khanh Tran](https://github.com/xxxpsyduck) for contributing the English documentation.
- Many thanks to [zhangxin](https://github.com/ZhangXinNan) for contributing the new visualization function, adding .gitignore, and removing the need to set PYTHONPATH manually.
- Many thanks to [lyl120117](https://github.com/lyl120117) for contributing the code for printing the network structure.
@@ -0,0 +1,17 @@
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

__all__ = ['PaddleOCR', 'draw_ocr']
from .paddleocr import PaddleOCR
from .tools.infer.utility import draw_ocr
@@ -0,0 +1,44 @@
Global:
  algorithm: CLS
  use_gpu: False
  epoch_num: 100
  log_smooth_window: 20
  print_batch_step: 100
  save_model_dir: output/cls_mv3
  save_epoch_step: 3
  eval_batch_step: 500
  train_batch_size_per_card: 512
  test_batch_size_per_card: 512
  image_shape: [3, 48, 192]
  label_list: ['0','180']
  distort: True
  reader_yml: ./configs/cls/cls_reader.yml
  pretrain_weights:
  checkpoints:
  save_inference_dir:
  infer_img:

Architecture:
  function: ppocr.modeling.architectures.cls_model,ClsModel

Backbone:
  function: ppocr.modeling.backbones.rec_mobilenet_v3,MobileNetV3
  scale: 0.35
  model_name: small

Head:
  function: ppocr.modeling.heads.cls_head,ClsHead
  class_dim: 2

Loss:
  function: ppocr.modeling.losses.cls_loss,ClsLoss

Optimizer:
  function: ppocr.optimizer,AdamDecay
  base_lr: 0.001
  beta1: 0.9
  beta2: 0.999
  decay:
    function: cosine_decay
    step_each_epoch: 1169
    total_epoch: 100
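The `cosine_decay` entry in the Optimizer section above anneals the learning rate from `base_lr` toward zero over `step_each_epoch * total_epoch` steps along half a cosine wave. A minimal sketch of that schedule, assuming the standard cosine-annealing formula (the framework's exact variant may differ, e.g. stepping per epoch rather than per iteration):

```python
import math

# Sketch of the cosine_decay schedule named in the config: lr follows
# half a cosine wave from base_lr down to 0 over the full training run.
def cosine_decay(step, base_lr=0.001, step_each_epoch=1169, total_epoch=100):
    total_steps = step_each_epoch * total_epoch
    return base_lr * 0.5 * (1 + math.cos(math.pi * step / total_steps))

# Starts at base_lr, halves mid-training, reaches ~0 at the final step.
lrs = [cosine_decay(s) for s in (0, 1169 * 50, 1169 * 100)]
```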
@@ -0,0 +1,13 @@
TrainReader:
  reader_function: ppocr.data.cls.dataset_traversal,SimpleReader
  num_workers: 8
  img_set_dir: ./train_data/cls
  label_file_path: ./train_data/cls/train.txt

EvalReader:
  reader_function: ppocr.data.cls.dataset_traversal,SimpleReader
  img_set_dir: ./train_data/cls
  label_file_path: ./train_data/cls/test.txt

TestReader:
  reader_function: ppocr.data.cls.dataset_traversal,SimpleReader
@@ -49,6 +49,6 @@ Optimizer:
 PostProcess:
     function: ppocr.postprocess.db_postprocess,DBPostProcess
     thresh: 0.3
-    box_thresh: 0.7
+    box_thresh: 0.6
     max_candidates: 1000
-    unclip_ratio: 2.0
+    unclip_ratio: 1.5
@@ -0,0 +1,59 @@
Global:
  algorithm: DB
  use_gpu: true
  epoch_num: 1200
  log_smooth_window: 20
  print_batch_step: 2
  save_model_dir: ./output/det_db/
  save_epoch_step: 200
  # evaluation is run every 5000 iterations after the 4000th iteration
  eval_batch_step: [4000, 5000]
  train_batch_size_per_card: 16
  test_batch_size_per_card: 16
  image_shape: [3, 640, 640]
  reader_yml: ./configs/det/det_db_icdar15_reader.yml
  pretrain_weights: ./pretrain_models/MobileNetV3_large_x0_5_pretrained/
  checkpoints:
  save_res_path: ./output/det_db/predicts_db.txt
  save_inference_dir:

Architecture:
  function: ppocr.modeling.architectures.det_model,DetModel

Backbone:
  function: ppocr.modeling.backbones.det_mobilenet_v3,MobileNetV3
  scale: 0.5
  model_name: large
  disable_se: true

Head:
  function: ppocr.modeling.heads.det_db_head,DBHead
  model_name: large
  k: 50
  inner_channels: 96
  out_channels: 2

Loss:
  function: ppocr.modeling.losses.det_db_loss,DBLoss
  balance_loss: true
  main_loss_type: DiceLoss
  alpha: 5
  beta: 10
  ohem_ratio: 3

Optimizer:
  function: ppocr.optimizer,AdamDecay
  base_lr: 0.001
  beta1: 0.9
  beta2: 0.999
  decay:
    function: cosine_decay_warmup
    step_each_epoch: 16
    total_epoch: 1200

PostProcess:
  function: ppocr.postprocess.db_postprocess,DBPostProcess
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.5
@@ -0,0 +1,57 @@
Global:
  algorithm: DB
  use_gpu: true
  epoch_num: 1200
  log_smooth_window: 20
  print_batch_step: 2
  save_model_dir: ./output/det_r_18_vd_db/
  save_epoch_step: 200
  eval_batch_step: [3000, 2000]
  train_batch_size_per_card: 8
  test_batch_size_per_card: 1
  image_shape: [3, 640, 640]
  reader_yml: ./configs/det/det_db_icdar15_reader.yml
  pretrain_weights: ./pretrain_models/ResNet18_vd_pretrained/
  save_res_path: ./output/det_r18_vd_db/predicts_db.txt
  checkpoints:
  save_inference_dir:

Architecture:
  function: ppocr.modeling.architectures.det_model,DetModel

Backbone:
  function: ppocr.modeling.backbones.det_resnet_vd,ResNet
  layers: 18

Head:
  function: ppocr.modeling.heads.det_db_head,DBHead
  model_name: large
  k: 50
  inner_channels: 256
  out_channels: 2

Loss:
  function: ppocr.modeling.losses.det_db_loss,DBLoss
  balance_loss: true
  main_loss_type: DiceLoss
  alpha: 5
  beta: 10
  ohem_ratio: 3

Optimizer:
  function: ppocr.optimizer,AdamDecay
  base_lr: 0.001
  beta1: 0.9
  beta2: 0.999
  decay:
    function: cosine_decay_warmup
    step_each_epoch: 32
    total_epoch: 1200

PostProcess:
  function: ppocr.postprocess.db_postprocess,DBPostProcess
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.5
@@ -0,0 +1,50 @@
Global:
  algorithm: SAST
  use_gpu: true
  epoch_num: 2000
  log_smooth_window: 20
  print_batch_step: 2
  save_model_dir: ./output/det_sast/
  save_epoch_step: 20
  eval_batch_step: 5000
  train_batch_size_per_card: 8
  test_batch_size_per_card: 8
  image_shape: [3, 512, 512]
  reader_yml: ./configs/det/det_sast_icdar15_reader.yml
  pretrain_weights: ./pretrain_models/ResNet50_vd_ssld_pretrained/
  save_res_path: ./output/det_sast/predicts_sast.txt
  checkpoints:
  save_inference_dir:

Architecture:
  function: ppocr.modeling.architectures.det_model,DetModel

Backbone:
  function: ppocr.modeling.backbones.det_resnet_vd_sast,ResNet
  layers: 50

Head:
  function: ppocr.modeling.heads.det_sast_head,SASTHead
  model_name: large
  only_fpn_up: False
  # with_cab: False
  with_cab: True

Loss:
  function: ppocr.modeling.losses.det_sast_loss,SASTLoss

Optimizer:
  function: ppocr.optimizer,RMSProp
  base_lr: 0.001
  decay:
    function: piecewise_decay
    boundaries: [30000, 50000, 80000, 100000, 150000]
    decay_rate: 0.3

PostProcess:
  function: ppocr.postprocess.sast_postprocess,SASTPostProcess
  score_thresh: 0.5
  sample_pts_num: 2
  nms_thresh: 0.2
  expand_scale: 1.0
  shrink_ratio_of_width: 0.3
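The `piecewise_decay` schedule in the SAST optimizer above keeps the learning rate constant between boundaries and scales it down at each one. A minimal sketch, assuming each crossed boundary multiplies `base_lr` by `decay_rate` (a common convention; the framework may instead take an explicit value list):

```python
import bisect

# Sketch of piecewise_decay with the config's boundaries/decay_rate:
# lr = base_lr * decay_rate**n, where n is the number of boundaries
# already crossed at the current step.
def piecewise_decay(step, base_lr=0.001,
                    boundaries=(30000, 50000, 80000, 100000, 150000),
                    decay_rate=0.3):
    n = bisect.bisect_right(list(boundaries), step)
    return base_lr * decay_rate ** n
```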
@@ -0,0 +1,50 @@
Global:
  algorithm: SAST
  use_gpu: true
  epoch_num: 2000
  log_smooth_window: 20
  print_batch_step: 2
  save_model_dir: ./output/det_sast/
  save_epoch_step: 20
  eval_batch_step: 5000
  train_batch_size_per_card: 8
  test_batch_size_per_card: 1
  image_shape: [3, 512, 512]
  reader_yml: ./configs/det/det_sast_totaltext_reader.yml
  pretrain_weights: ./pretrain_models/ResNet50_vd_ssld_pretrained/
  save_res_path: ./output/det_sast/predicts_sast.txt
  checkpoints:
  save_inference_dir:

Architecture:
  function: ppocr.modeling.architectures.det_model,DetModel

Backbone:
  function: ppocr.modeling.backbones.det_resnet_vd_sast,ResNet
  layers: 50

Head:
  function: ppocr.modeling.heads.det_sast_head,SASTHead
  model_name: large
  only_fpn_up: False
  # with_cab: False
  with_cab: True

Loss:
  function: ppocr.modeling.losses.det_sast_loss,SASTLoss

Optimizer:
  function: ppocr.optimizer,RMSProp
  base_lr: 0.001
  decay:
    function: piecewise_decay
    boundaries: [30000, 50000, 80000, 100000, 150000]
    decay_rate: 0.3

PostProcess:
  function: ppocr.postprocess.sast_postprocess,SASTPostProcess
  score_thresh: 0.5
  sample_pts_num: 6
  nms_thresh: 0.2
  expand_scale: 1.2
  shrink_ratio_of_width: 0.2
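The `nms_thresh: 0.2` in SAST post-processing controls a standard non-maximum suppression step: a candidate box whose overlap with an already-kept, higher-scoring box exceeds the threshold is discarded. A sketch using axis-aligned boxes for simplicity (SAST actually suppresses polygons, so take this as an illustration of the threshold, not the real implementation):

```python
# Sketch of IoU-based NMS as controlled by nms_thresh. Boxes are
# (x1, y1, x2, y2); lower nms_thresh suppresses more aggressively.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, nms_thresh=0.2):
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    return keep
```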
@@ -0,0 +1,24 @@
TrainReader:
  reader_function: ppocr.data.det.dataset_traversal,TrainReader
  process_function: ppocr.data.det.sast_process,SASTProcessTrain
  num_workers: 8
  img_set_dir: ./train_data/
  label_file_path: [./train_data/icdar2013/train_label_json.txt, ./train_data/icdar2015/train_label_json.txt, ./train_data/icdar17_mlt_latin/train_label_json.txt, ./train_data/coco_text_icdar_4pts/train_label_json.txt]
  data_ratio_list: [0.1, 0.45, 0.3, 0.15]
  min_crop_side_ratio: 0.3
  min_crop_size: 24
  min_text_size: 4
  max_text_size: 512

EvalReader:
  reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
  process_function: ppocr.data.det.sast_process,SASTProcessTest
  img_set_dir: ./train_data/icdar2015/text_localization/
  label_file_path: ./train_data/icdar2015/text_localization/test_icdar2015_label.txt
  max_side_len: 1536

TestReader:
  reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
  process_function: ppocr.data.det.sast_process,SASTProcessTest
  infer_img: ./train_data/icdar2015/text_localization/ch4_test_images/img_11.jpg
  max_side_len: 1536
@@ -0,0 +1,24 @@
TrainReader:
  reader_function: ppocr.data.det.dataset_traversal,TrainReader
  process_function: ppocr.data.det.sast_process,SASTProcessTrain
  num_workers: 8
  img_set_dir: ./train_data/
  label_file_path: [./train_data/art_latin_icdar_14pt/train_no_tt_test/train_label_json.txt, ./train_data/total_text_icdar_14pt/train_label_json.txt]
  data_ratio_list: [0.5, 0.5]
  min_crop_side_ratio: 0.3
  min_crop_size: 24
  min_text_size: 4
  max_text_size: 512

EvalReader:
  reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
  process_function: ppocr.data.det.sast_process,SASTProcessTest
  img_set_dir: ./train_data/
  label_file_path: ./train_data/total_text_icdar_14pt/test_label_json.txt
  max_side_len: 768

TestReader:
  reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
  process_function: ppocr.data.det.sast_process,SASTProcessTest
  infer_img: ./train_data/afs/total_text/Images/Test/img623.jpg
  max_side_len: 768
@ -0,0 +1,52 @@
Global:
  algorithm: CRNN
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_CRNN
  save_epoch_step: 3
  eval_batch_step: 2000
  train_batch_size_per_card: 128
  test_batch_size_per_card: 128
  image_shape: [3, 32, 320]
  max_text_length: 25
  character_type: ch
  character_dict_path: ./ppocr/utils/ppocr_keys_v1.txt
  loss_type: ctc
  distort: true
  use_space_char: true
  reader_yml: ./configs/rec/rec_chinese_reader.yml
  pretrain_weights:
  checkpoints:
  save_inference_dir:
  infer_img:

Architecture:
  function: ppocr.modeling.architectures.rec_model,RecModel

Backbone:
  function: ppocr.modeling.backbones.rec_resnet_vd,ResNet
  layers: 34

Head:
  function: ppocr.modeling.heads.rec_ctc_head,CTCPredict
  encoder_type: rnn
  fc_decay: 0.00004
  SeqRNN:
    hidden_size: 256

Loss:
  function: ppocr.modeling.losses.rec_ctc_loss,CTCLoss

Optimizer:
  function: ppocr.optimizer,AdamDecay
  base_lr: 0.0005
  l2_decay: 0.00004
  beta1: 0.9
  beta2: 0.999
  decay:
    function: cosine_decay_warmup
    step_each_epoch: 254
    total_epoch: 500
    warmup_minibatch: 1000
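The `cosine_decay_warmup` entry above combines a linear warmup over `warmup_minibatch` steps with a cosine decay over `step_each_epoch * total_epoch` steps. A minimal sketch of such a schedule (illustrative only, not PaddlePaddle's exact implementation):

```python
import math

def cosine_decay_warmup(step, base_lr=0.0005, warmup_minibatch=1000,
                        step_each_epoch=254, total_epoch=500):
    """Sketch of a warmup + cosine-decay learning-rate schedule."""
    if step < warmup_minibatch:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / warmup_minibatch
    total_steps = step_each_epoch * total_epoch
    # Cosine decay from base_lr down to 0 over the full run.
    return base_lr * 0.5 * (1 + math.cos(math.pi * step / total_steps))

lr_mid_warmup = cosine_decay_warmup(500)
lr_final = cosine_decay_warmup(254 * 500)
```

The warmup phase stabilizes early training at large batch sizes; the cosine phase then anneals the rate smoothly to zero.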
@ -0,0 +1,54 @@
Global:
  algorithm: CRNN
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_CRNN
  save_epoch_step: 3
  eval_batch_step: 2000
  train_batch_size_per_card: 256
  test_batch_size_per_card: 256
  image_shape: [3, 32, 320]
  max_text_length: 25
  character_type: ch
  character_dict_path: ./ppocr/utils/ppocr_keys_v1.txt
  loss_type: ctc
  distort: true
  use_space_char: true
  reader_yml: ./configs/rec/rec_chinese_reader.yml
  pretrain_weights:
  checkpoints:
  save_inference_dir:
  infer_img:

Architecture:
  function: ppocr.modeling.architectures.rec_model,RecModel

Backbone:
  function: ppocr.modeling.backbones.rec_mobilenet_v3,MobileNetV3
  scale: 0.5
  model_name: small
  small_stride: [1, 2, 2, 2]

Head:
  function: ppocr.modeling.heads.rec_ctc_head,CTCPredict
  encoder_type: rnn
  fc_decay: 0.00001
  SeqRNN:
    hidden_size: 48

Loss:
  function: ppocr.modeling.losses.rec_ctc_loss,CTCLoss

Optimizer:
  function: ppocr.optimizer,AdamDecay
  base_lr: 0.0005
  l2_decay: 0.00001
  beta1: 0.9
  beta2: 0.999
  decay:
    function: cosine_decay_warmup
    step_each_epoch: 254
    total_epoch: 500
    warmup_minibatch: 1000
@ -0,0 +1,53 @@
Global:
  algorithm: CRNN
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/en_number
  save_epoch_step: 3
  eval_batch_step: 2000
  train_batch_size_per_card: 256
  test_batch_size_per_card: 256
  image_shape: [3, 32, 320]
  max_text_length: 30
  character_type: ch
  character_dict_path: ./ppocr/utils/ic15_dict.txt
  loss_type: ctc
  distort: false
  use_space_char: false
  reader_yml: ./configs/rec/multi_languages/rec_en_reader.yml
  pretrain_weights:
  checkpoints:
  save_inference_dir:
  infer_img:

Architecture:
  function: ppocr.modeling.architectures.rec_model,RecModel

Backbone:
  function: ppocr.modeling.backbones.rec_mobilenet_v3,MobileNetV3
  scale: 0.5
  model_name: small
  small_stride: [1, 2, 2, 2]

Head:
  function: ppocr.modeling.heads.rec_ctc_head,CTCPredict
  encoder_type: rnn
  SeqRNN:
    hidden_size: 48

Loss:
  function: ppocr.modeling.losses.rec_ctc_loss,CTCLoss

Optimizer:
  function: ppocr.optimizer,AdamDecay
  l2_decay: 0.00001
  base_lr: 0.001
  beta1: 0.9
  beta2: 0.999
  decay:
    function: cosine_decay_warmup
    warmup_minibatch: 1000
    step_each_epoch: 6530
    total_epoch: 500
@ -0,0 +1,13 @@
TrainReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  num_workers: 8
  img_set_dir: ./train_data
  label_file_path: ./train_data/en_train.txt

EvalReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  img_set_dir: ./train_data
  label_file_path: ./train_data/en_eval.txt

TestReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
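`SimpleReader` resolves image paths relative to `img_set_dir` and reads `label_file_path`, where each line pairs an image path with its transcript, tab-separated. A hedged sketch of parsing such a label file (the helper name is illustrative, not PaddleOCR's API):

```python
def parse_rec_labels(label_text):
    """Parse 'image_path<TAB>transcript' lines into (path, label) pairs."""
    samples = []
    for line in label_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path, _, label = line.partition("\t")
        samples.append((path, label))
    return samples

samples = parse_rec_labels("word_001.png\tHELLO\nword_002.png\t42\n")
```

The same format is assumed by the train, eval, and test label files referenced above.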
@ -0,0 +1,52 @@
Global:
  algorithm: CRNN
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_french
  save_epoch_step: 1
  eval_batch_step: 2000
  train_batch_size_per_card: 256
  test_batch_size_per_card: 256
  image_shape: [3, 32, 320]
  max_text_length: 25
  character_type: french
  character_dict_path: ./ppocr/utils/french_dict.txt
  loss_type: ctc
  distort: true
  use_space_char: false
  reader_yml: ./configs/rec/multi_languages/rec_french_reader.yml
  pretrain_weights:
  checkpoints:
  save_inference_dir:
  infer_img:

Architecture:
  function: ppocr.modeling.architectures.rec_model,RecModel

Backbone:
  function: ppocr.modeling.backbones.rec_mobilenet_v3,MobileNetV3
  scale: 0.5
  model_name: small
  small_stride: [1, 2, 2, 2]

Head:
  function: ppocr.modeling.heads.rec_ctc_head,CTCPredict
  encoder_type: rnn
  SeqRNN:
    hidden_size: 48

Loss:
  function: ppocr.modeling.losses.rec_ctc_loss,CTCLoss

Optimizer:
  function: ppocr.optimizer,AdamDecay
  l2_decay: 0.00001
  base_lr: 0.001
  beta1: 0.9
  beta2: 0.999
  decay:
    function: cosine_decay
    step_each_epoch: 254
    total_epoch: 500
@ -0,0 +1,13 @@
TrainReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  num_workers: 8
  img_set_dir: ./train_data
  label_file_path: ./train_data/french_train.txt

EvalReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  img_set_dir: ./train_data
  label_file_path: ./train_data/french_eval.txt

TestReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
@ -0,0 +1,52 @@
Global:
  algorithm: CRNN
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_german
  save_epoch_step: 1
  eval_batch_step: 2000
  train_batch_size_per_card: 256
  test_batch_size_per_card: 256
  image_shape: [3, 32, 320]
  max_text_length: 25
  character_type: german
  character_dict_path: ./ppocr/utils/german_dict.txt
  loss_type: ctc
  distort: true
  use_space_char: false
  reader_yml: ./configs/rec/multi_languages/rec_ger_reader.yml
  pretrain_weights:
  checkpoints:
  save_inference_dir:
  infer_img:

Architecture:
  function: ppocr.modeling.architectures.rec_model,RecModel

Backbone:
  function: ppocr.modeling.backbones.rec_mobilenet_v3,MobileNetV3
  scale: 0.5
  model_name: small
  small_stride: [1, 2, 2, 2]

Head:
  function: ppocr.modeling.heads.rec_ctc_head,CTCPredict
  encoder_type: rnn
  SeqRNN:
    hidden_size: 48

Loss:
  function: ppocr.modeling.losses.rec_ctc_loss,CTCLoss

Optimizer:
  function: ppocr.optimizer,AdamDecay
  l2_decay: 0.00001
  base_lr: 0.001
  beta1: 0.9
  beta2: 0.999
  decay:
    function: cosine_decay
    step_each_epoch: 254
    total_epoch: 500
@ -0,0 +1,13 @@
TrainReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  num_workers: 8
  img_set_dir: ./train_data
  label_file_path: ./train_data/de_train.txt

EvalReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  img_set_dir: ./train_data
  label_file_path: ./train_data/de_eval.txt

TestReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
@ -0,0 +1,52 @@
Global:
  algorithm: CRNN
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_japan
  save_epoch_step: 1
  eval_batch_step: 2000
  train_batch_size_per_card: 256
  test_batch_size_per_card: 256
  image_shape: [3, 32, 320]
  max_text_length: 25
  character_type: japan
  character_dict_path: ./ppocr/utils/japan_dict.txt
  loss_type: ctc
  distort: true
  use_space_char: false
  reader_yml: ./configs/rec/multi_languages/rec_japan_reader.yml
  pretrain_weights:
  checkpoints:
  save_inference_dir:
  infer_img:

Architecture:
  function: ppocr.modeling.architectures.rec_model,RecModel

Backbone:
  function: ppocr.modeling.backbones.rec_mobilenet_v3,MobileNetV3
  scale: 0.5
  model_name: small
  small_stride: [1, 2, 2, 2]

Head:
  function: ppocr.modeling.heads.rec_ctc_head,CTCPredict
  encoder_type: rnn
  SeqRNN:
    hidden_size: 48

Loss:
  function: ppocr.modeling.losses.rec_ctc_loss,CTCLoss

Optimizer:
  function: ppocr.optimizer,AdamDecay
  l2_decay: 0.00001
  base_lr: 0.001
  beta1: 0.9
  beta2: 0.999
  decay:
    function: cosine_decay
    step_each_epoch: 254
    total_epoch: 500
@ -0,0 +1,13 @@
TrainReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  num_workers: 8
  img_set_dir: ./train_data
  label_file_path: ./train_data/japan_train.txt

EvalReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  img_set_dir: ./train_data
  label_file_path: ./train_data/japan_eval.txt

TestReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
@ -0,0 +1,52 @@
Global:
  algorithm: CRNN
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_korean
  save_epoch_step: 1
  eval_batch_step: 2000
  train_batch_size_per_card: 256
  test_batch_size_per_card: 256
  image_shape: [3, 32, 320]
  max_text_length: 25
  character_type: korean
  character_dict_path: ./ppocr/utils/korean_dict.txt
  loss_type: ctc
  distort: true
  use_space_char: false
  reader_yml: ./configs/rec/multi_languages/rec_korean_reader.yml
  pretrain_weights:
  checkpoints:
  save_inference_dir:
  infer_img:

Architecture:
  function: ppocr.modeling.architectures.rec_model,RecModel

Backbone:
  function: ppocr.modeling.backbones.rec_mobilenet_v3,MobileNetV3
  scale: 0.5
  model_name: small
  small_stride: [1, 2, 2, 2]

Head:
  function: ppocr.modeling.heads.rec_ctc_head,CTCPredict
  encoder_type: rnn
  SeqRNN:
    hidden_size: 48

Loss:
  function: ppocr.modeling.losses.rec_ctc_loss,CTCLoss

Optimizer:
  function: ppocr.optimizer,AdamDecay
  l2_decay: 0.00001
  base_lr: 0.001
  beta1: 0.9
  beta2: 0.999
  decay:
    function: cosine_decay
    step_each_epoch: 254
    total_epoch: 500
@ -0,0 +1,13 @@
TrainReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  num_workers: 8
  img_set_dir: ./train_data
  label_file_path: ./train_data/korean_train.txt

EvalReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
  img_set_dir: ./train_data
  label_file_path: ./train_data/korean_eval.txt

TestReader:
  reader_function: ppocr.data.rec.dataset_traversal,SimpleReader
@ -0,0 +1,49 @@
Global:
  algorithm: SRN
  use_gpu: true
  epoch_num: 72
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: output/rec_pvam_withrotate
  save_epoch_step: 1
  eval_batch_step: 8000
  train_batch_size_per_card: 64
  test_batch_size_per_card: 1
  image_shape: [1, 64, 256]
  max_text_length: 25
  character_type: en
  loss_type: srn
  num_heads: 8
  average_window: 0.15
  max_average_window: 15625
  min_average_window: 10000
  reader_yml: ./configs/rec/rec_benchmark_reader.yml
  pretrain_weights:
  checkpoints:
  save_inference_dir:
  infer_img:

Architecture:
  function: ppocr.modeling.architectures.rec_model,RecModel

Backbone:
  function: ppocr.modeling.backbones.rec_resnet_fpn,ResNet
  layers: 50

Head:
  function: ppocr.modeling.heads.rec_srn_all_head,SRNPredict
  encoder_type: rnn
  num_encoder_TUs: 2
  num_decoder_TUs: 4
  hidden_dims: 512
  SeqRNN:
    hidden_size: 256

Loss:
  function: ppocr.modeling.losses.rec_srn_loss,SRNLoss

Optimizer:
  function: ppocr.optimizer,AdamDecay
  base_lr: 0.0001
  beta1: 0.9
  beta2: 0.999
@ -1,6 +1,6 @@
 # How to test quickly
 ### 1. Install the latest version of Android Studio
-It can be downloaded from https://developer.android.com/studio. This demo was built with Android Studio 4.0.
+It can be downloaded from https://developer.android.com/studio . This demo was built with Android Studio 4.0.
 ### 2. Install NDK 20 or above
 The demo was tested with NDK 20b; any NDK version 20 or above compiles successfully.
@ -3,11 +3,11 @@ import java.security.MessageDigest
 apply plugin: 'com.android.application'
 
 android {
-    compileSdkVersion 28
+    compileSdkVersion 29
     defaultConfig {
         applicationId "com.baidu.paddle.lite.demo.ocr"
-        minSdkVersion 15
+        minSdkVersion 23
-        targetSdkVersion 28
+        targetSdkVersion 29
         versionCode 1
         versionName "1.0"
         testInstrumentationRunner "android.support.test.runner.AndroidJUnitRunner"

@ -39,9 +39,8 @@ android {
 
 dependencies {
     implementation fileTree(include: ['*.jar'], dir: 'libs')
-    implementation 'com.android.support:appcompat-v7:28.0.0'
+    implementation 'androidx.appcompat:appcompat:1.1.0'
-    implementation 'com.android.support.constraint:constraint-layout:1.1.3'
+    implementation 'androidx.constraintlayout:constraintlayout:1.1.3'
-    implementation 'com.android.support:design:28.0.0'
     testImplementation 'junit:junit:4.12'
     androidTestImplementation 'com.android.support.test:runner:1.0.2'
     androidTestImplementation 'com.android.support.test.espresso:espresso-core:3.0.2'
@ -14,10 +14,10 @@
         android:roundIcon="@mipmap/ic_launcher_round"
         android:supportsRtl="true"
         android:theme="@style/AppTheme">
+        <!-- to test MiniActivity, change this to com.baidu.paddle.lite.demo.ocr.MiniActivity -->
         <activity android:name="com.baidu.paddle.lite.demo.ocr.MainActivity">
             <intent-filter>
                 <action android:name="android.intent.action.MAIN"/>
 
                 <category android:name="android.intent.category.LAUNCHER"/>
             </intent-filter>
         </activity>

@ -25,6 +25,15 @@
             android:name="com.baidu.paddle.lite.demo.ocr.SettingsActivity"
             android:label="Settings">
         </activity>
+        <provider
+            android:name="androidx.core.content.FileProvider"
+            android:authorities="com.baidu.paddle.lite.demo.ocr.fileprovider"
+            android:exported="false"
+            android:grantUriPermissions="true">
+            <meta-data
+                android:name="android.support.FILE_PROVIDER_PATHS"
+                android:resource="@xml/file_paths"></meta-data>
+        </provider>
     </application>
 
 </manifest>
Binary image files added (not shown): 198 KiB, 171 KiB, 61 KiB.
@ -4,112 +4,111 @@
#include "native.h"
#include "ocr_ppredictor.h"
#include <algorithm>
#include <iterator>
#include <map>
#include <paddle_api.h>
#include <string>

static paddle::lite_api::PowerMode str_to_cpu_mode(const std::string &cpu_mode);

extern "C" JNIEXPORT jlong JNICALL
Java_com_baidu_paddle_lite_demo_ocr_OCRPredictorNative_init(
    JNIEnv *env, jobject thiz, jstring j_det_model_path,
    jstring j_rec_model_path, jstring j_cls_model_path, jint j_thread_num,
    jstring j_cpu_mode) {
  std::string det_model_path = jstring_to_cpp_string(env, j_det_model_path);
  std::string rec_model_path = jstring_to_cpp_string(env, j_rec_model_path);
  std::string cls_model_path = jstring_to_cpp_string(env, j_cls_model_path);
  int thread_num = j_thread_num;
  std::string cpu_mode = jstring_to_cpp_string(env, j_cpu_mode);
  ppredictor::OCR_Config conf;
  conf.thread_num = thread_num;
  conf.mode = str_to_cpu_mode(cpu_mode);
  ppredictor::OCR_PPredictor *orc_predictor =
      new ppredictor::OCR_PPredictor{conf};
  orc_predictor->init_from_file(det_model_path, rec_model_path, cls_model_path);
  return reinterpret_cast<jlong>(orc_predictor);
}

/**
 * Convert "LITE_POWER_HIGH" to paddle::lite_api::LITE_POWER_HIGH
 * @param cpu_mode
 * @return
 */
static paddle::lite_api::PowerMode
str_to_cpu_mode(const std::string &cpu_mode) {
  static std::map<std::string, paddle::lite_api::PowerMode> cpu_mode_map{
      {"LITE_POWER_HIGH", paddle::lite_api::LITE_POWER_HIGH},
      {"LITE_POWER_LOW", paddle::lite_api::LITE_POWER_HIGH},
      {"LITE_POWER_FULL", paddle::lite_api::LITE_POWER_FULL},
      {"LITE_POWER_NO_BIND", paddle::lite_api::LITE_POWER_NO_BIND},
      {"LITE_POWER_RAND_HIGH", paddle::lite_api::LITE_POWER_RAND_HIGH},
      {"LITE_POWER_RAND_LOW", paddle::lite_api::LITE_POWER_RAND_LOW}};
  std::string upper_key;
  std::transform(cpu_mode.cbegin(), cpu_mode.cend(),
                 std::back_inserter(upper_key), ::toupper);
  auto index = cpu_mode_map.find(upper_key);
  if (index == cpu_mode_map.end()) {
    LOGE("cpu_mode not found %s", upper_key.c_str());
    return paddle::lite_api::LITE_POWER_HIGH;
  } else {
    return index->second;
  }
}

extern "C" JNIEXPORT jfloatArray JNICALL
Java_com_baidu_paddle_lite_demo_ocr_OCRPredictorNative_forward(
    JNIEnv *env, jobject thiz, jlong java_pointer, jfloatArray buf,
    jfloatArray ddims, jobject original_image) {
  LOGI("begin to run native forward");
  if (java_pointer == 0) {
    LOGE("JAVA pointer is NULL");
    return cpp_array_to_jfloatarray(env, nullptr, 0);
  }
  cv::Mat origin = bitmap_to_cv_mat(env, original_image);
  if (origin.size == 0) {
    LOGE("origin bitmap cannot convert to CV Mat");
    return cpp_array_to_jfloatarray(env, nullptr, 0);
  }
  ppredictor::OCR_PPredictor *ppredictor =
      (ppredictor::OCR_PPredictor *)java_pointer;
  std::vector<float> dims_float_arr = jfloatarray_to_float_vector(env, ddims);
  std::vector<int64_t> dims_arr;
  dims_arr.resize(dims_float_arr.size());
  std::copy(dims_float_arr.cbegin(), dims_float_arr.cend(), dims_arr.begin());

  // The buffer is large here, so jfloatarray_to_float_vector is not used
  int64_t buf_len = (int64_t)env->GetArrayLength(buf);
  jfloat *buf_data = env->GetFloatArrayElements(buf, JNI_FALSE);
  float *data = (jfloat *)buf_data;
  std::vector<ppredictor::OCRPredictResult> results =
      ppredictor->infer_ocr(dims_arr, data, buf_len, NET_OCR, origin);
  LOGI("infer_ocr finished with boxes %ld", results.size());
  // Serialize std::vector<ppredictor::OCRPredictResult> into a float array,
  // which is transferred to the Java layer and deserialized there
  std::vector<float> float_arr;
  for (const ppredictor::OCRPredictResult &r : results) {
    float_arr.push_back(r.points.size());
    float_arr.push_back(r.word_index.size());
    float_arr.push_back(r.score);
    for (const std::vector<int> &point : r.points) {
      float_arr.push_back(point.at(0));
      float_arr.push_back(point.at(1));
    }
    for (int index : r.word_index) {
      float_arr.push_back(index);
    }
  }
  return cpp_array_to_jfloatarray(env, float_arr.data(), float_arr.size());
}

extern "C" JNIEXPORT void JNICALL
Java_com_baidu_paddle_lite_demo_ocr_OCRPredictorNative_release(
    JNIEnv *env, jobject thiz, jlong java_pointer) {
  if (java_pointer == 0) {
    LOGE("JAVA pointer is NULL");
    return;
  }
  ppredictor::OCR_PPredictor *ppredictor =
      (ppredictor::OCR_PPredictor *)java_pointer;
  delete ppredictor;
}
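The forward() JNI function flattens each OCRPredictResult as `[n_points, n_word_indices, score, x0, y0, ..., idx0, idx1, ...]` before handing the array to Java. A sketch of the matching deserialization (illustrative; the Java demo does this on its side):

```python
def decode_ocr_results(arr):
    """Decode the flat float array produced by the JNI forward():
    per result: point count, word-index count, score, then the
    (x, y) point pairs, then the word indices."""
    results, i = [], 0
    while i < len(arr):
        n_pts = int(arr[i])
        n_idx = int(arr[i + 1])
        score = arr[i + 2]
        i += 3
        points = [(int(arr[i + 2 * k]), int(arr[i + 2 * k + 1]))
                  for k in range(n_pts)]
        i += 2 * n_pts
        word_index = [int(v) for v in arr[i:i + n_idx]]
        i += n_idx
        results.append({"points": points, "word_index": word_index,
                        "score": score})
    return results
```

Packing everything into one float array keeps the JNI boundary to a single array copy per inference call.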
@ -0,0 +1,46 @@
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "ocr_cls_process.h"
#include <cmath>
#include <cstring>
#include <fstream>
#include <iostream>
#include <vector>

const std::vector<int> CLS_IMAGE_SHAPE = {3, 32, 100};

cv::Mat cls_resize_img(const cv::Mat &img) {
  int imgC = CLS_IMAGE_SHAPE[0];
  int imgW = CLS_IMAGE_SHAPE[2];
  int imgH = CLS_IMAGE_SHAPE[1];

  float ratio = float(img.cols) / float(img.rows);
  int resize_w = 0;
  if (ceilf(imgH * ratio) > imgW)
    resize_w = imgW;
  else
    resize_w = int(ceilf(imgH * ratio));

  cv::Mat resize_img;
  cv::resize(img, resize_img, cv::Size(resize_w, imgH), 0.f, 0.f,
             cv::INTER_CUBIC);

  if (resize_w < imgW) {
    cv::copyMakeBorder(resize_img, resize_img, 0, 0, 0, int(imgW - resize_w),
                       cv::BORDER_CONSTANT, {0, 0, 0});
  }
  return resize_img;
}
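`cls_resize_img` scales the crop to height 32 while preserving aspect ratio, caps the width at 100, and right-pads the remainder with black via `copyMakeBorder`. The width computation can be sketched without the OpenCV dependency (helper name illustrative):

```python
import math

def cls_resize_width(img_w, img_h, target_h=32, max_w=100):
    """Mirror of cls_resize_img's width logic: scale to target_h,
    cap the width at max_w; the remainder is black right-padding."""
    ratio = img_w / img_h
    resize_w = min(max_w, int(math.ceil(target_h * ratio)))
    pad = max_w - resize_w  # width filled by cv::copyMakeBorder
    return resize_w, pad
```

This fixed 3x32x100 input matches the `CLS_IMAGE_SHAPE` the direction classifier expects.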
@ -0,0 +1,23 @@
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#pragma once

#include "common.h"
#include <opencv2/opencv.hpp>
#include <vector>

extern const std::vector<int> CLS_IMAGE_SHAPE;

cv::Mat cls_resize_img(const cv::Mat &img);
@@ -3,184 +3,237 @@
//

#include "ocr_ppredictor.h"
#include "common.h"
#include "ocr_cls_process.h"
#include "ocr_crnn_process.h"
#include "ocr_db_post_process.h"
#include "preprocess.h"

namespace ppredictor {

OCR_PPredictor::OCR_PPredictor(const OCR_Config &config) : _config(config) {}

int OCR_PPredictor::init(const std::string &det_model_content,
                         const std::string &rec_model_content,
                         const std::string &cls_model_content) {
  _det_predictor = std::unique_ptr<PPredictor>(
      new PPredictor{_config.thread_num, NET_OCR, _config.mode});
  _det_predictor->init_nb(det_model_content);

  _rec_predictor = std::unique_ptr<PPredictor>(
      new PPredictor{_config.thread_num, NET_OCR_INTERNAL, _config.mode});
  _rec_predictor->init_nb(rec_model_content);

  _cls_predictor = std::unique_ptr<PPredictor>(
      new PPredictor{_config.thread_num, NET_OCR_INTERNAL, _config.mode});
  _cls_predictor->init_nb(cls_model_content);
  return RETURN_OK;
}

int OCR_PPredictor::init_from_file(const std::string &det_model_path,
                                   const std::string &rec_model_path,
                                   const std::string &cls_model_path) {
  _det_predictor = std::unique_ptr<PPredictor>(
      new PPredictor{_config.thread_num, NET_OCR, _config.mode});
  _det_predictor->init_from_file(det_model_path);

  _rec_predictor = std::unique_ptr<PPredictor>(
      new PPredictor{_config.thread_num, NET_OCR_INTERNAL, _config.mode});
  _rec_predictor->init_from_file(rec_model_path);

  _cls_predictor = std::unique_ptr<PPredictor>(
      new PPredictor{_config.thread_num, NET_OCR_INTERNAL, _config.mode});
  _cls_predictor->init_from_file(cls_model_path);
  return RETURN_OK;
}
/**
 * For debug use: visualize the boxes selected in the first step
 * @param filter_boxes
 * @param boxes
 * @param srcimg
 */
static void
visual_img(const std::vector<std::vector<std::vector<int>>> &filter_boxes,
           const std::vector<std::vector<std::vector<int>>> &boxes,
           const cv::Mat &srcimg) {
  // visualization
  cv::Point rook_points[filter_boxes.size()][4];
  for (int n = 0; n < filter_boxes.size(); n++) {
    for (int m = 0; m < filter_boxes[0].size(); m++) {
      rook_points[n][m] =
          cv::Point(int(filter_boxes[n][m][0]), int(filter_boxes[n][m][1]));
    }
  }

  cv::Mat img_vis;
  srcimg.copyTo(img_vis);
  for (int n = 0; n < boxes.size(); n++) {
    const cv::Point *ppt[1] = {rook_points[n]};
    int npt[] = {4};
    cv::polylines(img_vis, ppt, npt, 1, 1, CV_RGB(0, 255, 0), 2, 8, 0);
  }
  // For debugging; replace the output path as needed
  cv::imwrite("/sdcard/1/vis.png", img_vis);
}

std::vector<OCRPredictResult>
OCR_PPredictor::infer_ocr(const std::vector<int64_t> &dims,
                          const float *input_data, int input_len, int net_flag,
                          cv::Mat &origin) {
  PredictorInput input = _det_predictor->get_first_input();
  input.set_dims(dims);
  input.set_data(input_data, input_len);
  std::vector<PredictorOutput> results = _det_predictor->infer();
  PredictorOutput &res = results.at(0);
  std::vector<std::vector<std::vector<int>>> filtered_box = calc_filtered_boxes(
      res.get_float_data(), res.get_size(), (int)dims[2], (int)dims[3], origin);
  LOGI("Filter_box size %ld", filtered_box.size());
  return infer_rec(filtered_box, origin);
}

std::vector<OCRPredictResult> OCR_PPredictor::infer_rec(
    const std::vector<std::vector<std::vector<int>>> &boxes,
    const cv::Mat &origin_img) {
  std::vector<float> mean = {0.5f, 0.5f, 0.5f};
  std::vector<float> scale = {1 / 0.5f, 1 / 0.5f, 1 / 0.5f};
  std::vector<int64_t> dims = {1, 3, 0, 0};
  std::vector<OCRPredictResult> ocr_results;

  PredictorInput input = _rec_predictor->get_first_input();
  for (auto bp = boxes.crbegin(); bp != boxes.crend(); ++bp) {
    const std::vector<std::vector<int>> &box = *bp;
    cv::Mat crop_img = get_rotate_crop_image(origin_img, box);
    crop_img = infer_cls(crop_img);

    float wh_ratio = float(crop_img.cols) / float(crop_img.rows);
    cv::Mat input_image = crnn_resize_img(crop_img, wh_ratio);
    input_image.convertTo(input_image, CV_32FC3, 1 / 255.0f);
    const float *dimg = reinterpret_cast<const float *>(input_image.data);
    int input_size = input_image.rows * input_image.cols;

    dims[2] = input_image.rows;
    dims[3] = input_image.cols;
    input.set_dims(dims);

    neon_mean_scale(dimg, input.get_mutable_float_data(), input_size, mean,
                    scale);

    std::vector<PredictorOutput> results = _rec_predictor->infer();

    OCRPredictResult res;
    res.word_index = postprocess_rec_word_index(results.at(0));
    if (res.word_index.empty()) {
      continue;
    }
    res.score = postprocess_rec_score(results.at(1));
    res.points = box;
    ocr_results.emplace_back(std::move(res));
  }
  LOGI("ocr_results finished %lu", ocr_results.size());
  return ocr_results;
}

cv::Mat OCR_PPredictor::infer_cls(const cv::Mat &img, float thresh) {
  std::vector<float> mean = {0.5f, 0.5f, 0.5f};
  std::vector<float> scale = {1 / 0.5f, 1 / 0.5f, 1 / 0.5f};
  std::vector<int64_t> dims = {1, 3, 0, 0};
  std::vector<OCRPredictResult> ocr_results;

  PredictorInput input = _cls_predictor->get_first_input();

  cv::Mat input_image = cls_resize_img(img);
  input_image.convertTo(input_image, CV_32FC3, 1 / 255.0f);
  const float *dimg = reinterpret_cast<const float *>(input_image.data);
  int input_size = input_image.rows * input_image.cols;

  dims[2] = input_image.rows;
  dims[3] = input_image.cols;
  input.set_dims(dims);

  neon_mean_scale(dimg, input.get_mutable_float_data(), input_size, mean,
                  scale);

  std::vector<PredictorOutput> results = _cls_predictor->infer();

  const float *scores = results.at(0).get_float_data();
  const int *labels = results.at(1).get_int_data();
  for (int64_t i = 0; i < results.at(0).get_size(); i++) {
    LOGI("output scores [%f]", scores[i]);
  }
  for (int64_t i = 0; i < results.at(1).get_size(); i++) {
    LOGI("output label [%d]", labels[i]);
  }
  int label_idx = labels[0];
  float score = scores[label_idx];

  cv::Mat srcimg;
  img.copyTo(srcimg);
  if (label_idx % 2 == 1 && score > thresh) {
    cv::rotate(srcimg, srcimg, 1);
  }
  return srcimg;
}

std::vector<std::vector<std::vector<int>>>
OCR_PPredictor::calc_filtered_boxes(const float *pred, int pred_size,
                                    int output_height, int output_width,
                                    const cv::Mat &origin) {
  const double threshold = 0.3;
  const double maxvalue = 1;

  cv::Mat pred_map = cv::Mat::zeros(output_height, output_width, CV_32F);
  memcpy(pred_map.data, pred, pred_size * sizeof(float));
  cv::Mat cbuf_map;
  pred_map.convertTo(cbuf_map, CV_8UC1);

  cv::Mat bit_map;
  cv::threshold(cbuf_map, bit_map, threshold, maxvalue, cv::THRESH_BINARY);

  std::vector<std::vector<std::vector<int>>> boxes =
      boxes_from_bitmap(pred_map, bit_map);
  float ratio_h = output_height * 1.0f / origin.rows;
  float ratio_w = output_width * 1.0f / origin.cols;
  std::vector<std::vector<std::vector<int>>> filter_boxes =
      filter_tag_det_res(boxes, ratio_h, ratio_w, origin);
  return filter_boxes;
}

std::vector<int>
OCR_PPredictor::postprocess_rec_word_index(const PredictorOutput &res) {
  const int *rec_idx = res.get_int_data();
  const std::vector<std::vector<uint64_t>> rec_idx_lod = res.get_lod();

  std::vector<int> pred_idx;
  for (int n = int(rec_idx_lod[0][0]); n < int(rec_idx_lod[0][1] * 2); n += 2) {
    pred_idx.emplace_back(rec_idx[n]);
  }
  return pred_idx;
}

float OCR_PPredictor::postprocess_rec_score(const PredictorOutput &res) {
  const float *predict_batch = res.get_float_data();
  const std::vector<int64_t> predict_shape = res.get_shape();
  const std::vector<std::vector<uint64_t>> predict_lod = res.get_lod();
  int blank = predict_shape[1];
  float score = 0.f;
  int count = 0;
  for (int n = predict_lod[0][0]; n < predict_lod[0][1] - 1; n++) {
    int argmax_idx = argmax(predict_batch + n * predict_shape[1],
                            predict_batch + (n + 1) * predict_shape[1]);
    float max_value = predict_batch[n * predict_shape[1] + argmax_idx];
    if (blank - 1 - argmax_idx > 1e-5) {
      score += max_value;
      count += 1;
    }
  }
  if (count == 0) {
    LOGE("calc score count 0");
  } else {
    score /= count;
  }
  LOGI("calc score: %f", score);
  return score;
}

NET_TYPE OCR_PPredictor::get_net_flag() const { return NET_OCR; }
}
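The confidence computed by `postprocess_rec_score` is the average of the per-timestep maximum softmax probabilities over the non-blank steps. A simplified Python sketch (the real code walks a flattened LoD tensor rather than a list of rows, and `blank` is `predict_shape[1]`):

```python
def rec_score(probs, blank):
    """Average the max probability over timesteps whose argmax is not
    the blank label; returns 0.0 when every step predicts blank."""
    score, count = 0.0, 0
    for step in probs:  # one list of class probabilities per timestep
        argmax_idx = max(range(len(step)), key=step.__getitem__)
        if blank - 1 - argmax_idx > 1e-5:  # skip blank predictions
            score += step[argmax_idx]
            count += 1
    return score / count if count else 0.0
```

With `probs = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]` and `blank=2`, only the middle step is non-blank, so the score is 0.8.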
@@ -4,109 +4,119 @@

#pragma once

#include "ppredictor.h"
#include <opencv2/opencv.hpp>
#include <paddle_api.h>
#include <string>

namespace ppredictor {

/**
 * Config
 */
struct OCR_Config {
  int thread_num = 4; // Thread num
  paddle::lite_api::PowerMode mode =
      paddle::lite_api::LITE_POWER_HIGH; // PaddleLite Mode
};

/**
 * Recognition result for one polygon (text box)
 */
struct OCRPredictResult {
  std::vector<int> word_index;
  std::vector<std::vector<int>> points;
  float score;
};

/**
 * OCR uses two models:
 * 1. The first model (det) selects polygons marking where the text is
 * 2. Those polygons are cropped from the original image and the second
 *    model (rec) extracts the text
 */
class OCR_PPredictor : public PPredictor_Interface {
public:
  OCR_PPredictor(const OCR_Config &config);

  virtual ~OCR_PPredictor() {}

  /**
   * Initialize the Predictors for the models
   * @param det_model_content
   * @param rec_model_content
   * @param cls_model_content
   * @return
   */
  int init(const std::string &det_model_content,
           const std::string &rec_model_content,
           const std::string &cls_model_content);
  int init_from_file(const std::string &det_model_path,
                     const std::string &rec_model_path,
                     const std::string &cls_model_path);
  /**
   * Return OCR result
   * @param dims
   * @param input_data
   * @param input_len
   * @param net_flag
   * @param origin
   * @return
   */
  virtual std::vector<OCRPredictResult>
  infer_ocr(const std::vector<int64_t> &dims, const float *input_data,
            int input_len, int net_flag, cv::Mat &origin);

  virtual NET_TYPE get_net_flag() const;

private:
  /**
   * Calculate text polygons from the output of the first model
   * @param pred
   * @param output_height
   * @param output_width
   * @param origin
   * @return
   */
  std::vector<std::vector<std::vector<int>>>
  calc_filtered_boxes(const float *pred, int pred_size, int output_height,
                      int output_width, const cv::Mat &origin);

  /**
   * Inference with the second (rec) model
   *
   * @param boxes
   * @param origin
   * @return
   */
  std::vector<OCRPredictResult>
  infer_rec(const std::vector<std::vector<std::vector<int>>> &boxes,
            const cv::Mat &origin);

  /**
   * Inference with the cls (direction classifier) model
   *
   * @param origin
   * @param thresh
   * @return
   */
  cv::Mat infer_cls(const cv::Mat &origin, float thresh = 0.5);

  /**
   * Postprocess of the second model, extracting the text
   * @param res
   * @return
   */
  std::vector<int> postprocess_rec_word_index(const PredictorOutput &res);

  /**
   * Calculate the confidence of the second model's text result
   * @param res
   * @return
   */
  float postprocess_rec_score(const PredictorOutput &res);

  std::unique_ptr<PPredictor> _det_predictor;
  std::unique_ptr<PPredictor> _rec_predictor;
  std::unique_ptr<PPredictor> _cls_predictor;
  OCR_Config _config;
};
}
@@ -7,7 +7,7 @@
namespace ppredictor {

/**
 * PaddleLite Predictor Common Interface
 */
class PPredictor_Interface {
public:
@@ -21,7 +21,7 @@ public:
};

/**
 * Common Predictor
 */
class PPredictor : public PPredictor_Interface {
public:
@@ -33,9 +33,9 @@ public:
  }

  /**
   * Initialize a PaddleLite opt model in nb format (alternative to init_paddle)
   * @param model_content
   * @return 0
   */
  virtual int init_nb(const std::string &model_content);
@@ -21,10 +21,10 @@ public:
  const std::vector<std::vector<uint64_t>> get_lod() const;
  const std::vector<int64_t> get_shape() const;

  std::vector<float> data;   // most layers return float; alternative to data_int
  std::vector<int> data_int; // a few layers return int; alternative to data
  std::vector<int64_t> shape;             // PaddleLite output shape
  std::vector<std::vector<uint64_t>> lod; // PaddleLite output lod

private:
  std::unique_ptr<const paddle::lite_api::Tensor> _tensor;
@@ -19,15 +19,16 @@ package com.baidu.paddle.lite.demo.ocr;
import android.content.res.Configuration;
import android.os.Bundle;
import android.preference.PreferenceActivity;
import android.view.MenuInflater;
import android.view.View;
import android.view.ViewGroup;

import androidx.annotation.LayoutRes;
import androidx.annotation.Nullable;
import androidx.appcompat.app.ActionBar;
import androidx.appcompat.app.AppCompatDelegate;
import androidx.appcompat.widget.Toolbar;

/**
 * A {@link PreferenceActivity} which implements and proxies the necessary calls
 * to be used with AppCompat.
@@ -3,23 +3,22 @@ package com.baidu.paddle.lite.demo.ocr;
import android.Manifest;
import android.app.ProgressDialog;
import android.content.ContentResolver;
import android.content.Context;
import android.content.Intent;
import android.content.SharedPreferences;
import android.content.pm.PackageManager;
import android.database.Cursor;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.media.ExifInterface;
import android.net.Uri;
import android.os.Bundle;
import android.os.Environment;
import android.os.Handler;
import android.os.HandlerThread;
import android.os.Message;
import android.preference.PreferenceManager;
import android.provider.MediaStore;
import android.text.method.ScrollingMovementMethod;
import android.util.Log;
import android.view.Menu;
@@ -29,9 +28,17 @@ import android.widget.ImageView;
import android.widget.TextView;
import android.widget.Toast;

import androidx.annotation.NonNull;
import androidx.appcompat.app.AppCompatActivity;
import androidx.core.app.ActivityCompat;
import androidx.core.content.ContextCompat;
import androidx.core.content.FileProvider;

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.text.SimpleDateFormat;
import java.util.Date;

public class MainActivity extends AppCompatActivity {
    private static final String TAG = MainActivity.class.getSimpleName();
@@ -69,6 +76,7 @@ public class MainActivity extends AppCompatActivity {
    protected float[] inputMean = new float[]{};
    protected float[] inputStd = new float[]{};
    protected float scoreThreshold = 0.1f;
    private String currentPhotoPath;

    protected Predictor predictor = new Predictor();

@@ -368,18 +376,56 @@ public class MainActivity extends AppCompatActivity {
    }

    private void takePhoto() {
        Intent takePictureIntent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
        // Ensure that there's a camera activity to handle the intent
        if (takePictureIntent.resolveActivity(getPackageManager()) != null) {
            // Create the File where the photo should go
            File photoFile = null;
            try {
                photoFile = createImageFile();
            } catch (IOException ex) {
                Log.e("MainActivity", ex.getMessage(), ex);
                Toast.makeText(MainActivity.this,
                        "Create Camera temp file failed: " + ex.getMessage(), Toast.LENGTH_SHORT).show();
            }
            // Continue only if the File was successfully created
            if (photoFile != null) {
                Log.i(TAG, "FILEPATH " + getExternalFilesDir("Pictures").getAbsolutePath());
                Uri photoURI = FileProvider.getUriForFile(this,
                        "com.baidu.paddle.lite.demo.ocr.fileprovider",
                        photoFile);
                currentPhotoPath = photoFile.getAbsolutePath();
                takePictureIntent.putExtra(MediaStore.EXTRA_OUTPUT, photoURI);
                startActivityForResult(takePictureIntent, TAKE_PHOTO_REQUEST_CODE);
                Log.i(TAG, "startActivityForResult finished");
            }
        }
    }

    private File createImageFile() throws IOException {
        // Create an image file name
        String timeStamp = new SimpleDateFormat("yyyyMMdd_HHmmss").format(new Date());
        String imageFileName = "JPEG_" + timeStamp + "_";
        File storageDir = getExternalFilesDir(Environment.DIRECTORY_PICTURES);
        File image = File.createTempFile(
                imageFileName, /* prefix */
                ".bmp",        /* suffix */
                storageDir     /* directory */
        );

        return image;
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if (resultCode == RESULT_OK) {
            switch (requestCode) {
                case OPEN_GALLERY_REQUEST_CODE:
                    if (data == null) {
                        break;
                    }
                    try {
                        ContentResolver resolver = getContentResolver();
                        Uri uri = data.getData();
@@ -393,9 +439,22 @@ public class MainActivity extends AppCompatActivity {
                    }
                    break;
                case TAKE_PHOTO_REQUEST_CODE:
                    if (currentPhotoPath != null) {
                        ExifInterface exif = null;
                        try {
                            exif = new ExifInterface(currentPhotoPath);
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                        int orientation = exif.getAttributeInt(ExifInterface.TAG_ORIENTATION,
                                ExifInterface.ORIENTATION_UNDEFINED);
                        Log.i(TAG, "rotation " + orientation);
                        Bitmap image = BitmapFactory.decodeFile(currentPhotoPath);
                        image = Utils.rotateBitmap(image, orientation);
                        onImageChanged(image);
                    } else {
                        Log.e(TAG, "currentPhotoPath is null");
                    }
                    break;
                default:
                    break;
@@ -0,0 +1,157 @@
+package com.baidu.paddle.lite.demo.ocr;
+
+import android.graphics.Bitmap;
+import android.graphics.BitmapFactory;
+import android.os.Build;
+import android.os.Bundle;
+import android.os.Handler;
+import android.os.HandlerThread;
+import android.os.Message;
+import android.util.Log;
+import android.view.View;
+import android.widget.Button;
+import android.widget.ImageView;
+import android.widget.TextView;
+import android.widget.Toast;
+
+import androidx.appcompat.app.AppCompatActivity;
+
+import java.io.IOException;
+import java.io.InputStream;
+
+public class MiniActivity extends AppCompatActivity {
+
+    public static final int REQUEST_LOAD_MODEL = 0;
+    public static final int REQUEST_RUN_MODEL = 1;
+    public static final int REQUEST_UNLOAD_MODEL = 2;
+    public static final int RESPONSE_LOAD_MODEL_SUCCESSED = 0;
+    public static final int RESPONSE_LOAD_MODEL_FAILED = 1;
+    public static final int RESPONSE_RUN_MODEL_SUCCESSED = 2;
+    public static final int RESPONSE_RUN_MODEL_FAILED = 3;
+
+    private static final String TAG = "MiniActivity";
+
+    protected Handler receiver = null; // Receive messages from worker thread
+    protected Handler sender = null; // Send commands to worker thread
+    protected HandlerThread worker = null; // Worker thread to load & run model
+    protected volatile Predictor predictor = null;
+
+    private String assetModelDirPath = "models/ocr_v1_for_cpu";
+    private String assetlabelFilePath = "labels/ppocr_keys_v1.txt";
+
+    private Button button;
+    private ImageView imageView; // image result
+    private TextView textView; // text result
+
+    @Override
+    protected void onCreate(Bundle savedInstanceState) {
+        super.onCreate(savedInstanceState);
+        setContentView(R.layout.activity_mini);
+
+        Log.i(TAG, "SHOW in Logcat");
+
+        // Prepare the worker thread for model loading and inference
+        worker = new HandlerThread("Predictor Worker");
+        worker.start();
+        sender = new Handler(worker.getLooper()) {
+            public void handleMessage(Message msg) {
+                switch (msg.what) {
+                    case REQUEST_LOAD_MODEL:
+                        // Load model and reload test image
+                        if (!onLoadModel()) {
+                            runOnUiThread(new Runnable() {
+                                @Override
+                                public void run() {
+                                    Toast.makeText(MiniActivity.this, "Load model failed!", Toast.LENGTH_SHORT).show();
+                                }
+                            });
+                        }
+                        break;
+                    case REQUEST_RUN_MODEL:
+                        // Run model if model is loaded
+                        final boolean isSuccessed = onRunModel();
+                        runOnUiThread(new Runnable() {
+                            @Override
+                            public void run() {
+                                if (isSuccessed) {
+                                    onRunModelSuccessed();
+                                } else {
+                                    Toast.makeText(MiniActivity.this, "Run model failed!", Toast.LENGTH_SHORT).show();
+                                }
+                            }
+                        });
+                        break;
+                }
+            }
+        };
+        sender.sendEmptyMessage(REQUEST_LOAD_MODEL); // corresponds to REQUEST_LOAD_MODEL, calls onLoadModel()
+
+        imageView = findViewById(R.id.imageView);
+        textView = findViewById(R.id.sample_text);
+        button = findViewById(R.id.button);
+        button.setOnClickListener(new View.OnClickListener() {
+            @Override
+            public void onClick(View v) {
+                sender.sendEmptyMessage(REQUEST_RUN_MODEL);
+            }
+        });
+    }
+
+    @Override
+    protected void onDestroy() {
+        onUnloadModel();
+        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.JELLY_BEAN_MR2) {
+            worker.quitSafely();
+        } else {
+            worker.quit();
+        }
+        super.onDestroy();
+    }
+
+    /**
+     * called in onCreate, model init
+     *
+     * @return
+     */
+    private boolean onLoadModel() {
+        if (predictor == null) {
+            predictor = new Predictor();
+        }
+        return predictor.init(this, assetModelDirPath, assetlabelFilePath);
+    }
+
+    /**
+     * run the engine on a test image
+     *
+     * @return
+     */
+    private boolean onRunModel() {
+        try {
+            String assetImagePath = "images/5.jpg";
+            InputStream imageStream = getAssets().open(assetImagePath);
+            Bitmap image = BitmapFactory.decodeStream(imageStream);
+            // Input is Bitmap
+            predictor.setInputImage(image);
+            return predictor.isLoaded() && predictor.runModel();
+        } catch (IOException e) {
+            e.printStackTrace();
+            return false;
+        }
+    }
+
+    private void onRunModelSuccessed() {
+        Log.i(TAG, "onRunModelSuccessed");
+        textView.setText(predictor.outputResult);
+        imageView.setImageBitmap(predictor.outputImage);
+    }
+
+    private void onUnloadModel() {
+        if (predictor != null) {
+            predictor.releaseModel();
+        }
+    }
+}
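The threading pattern MiniActivity relies on (the UI thread posts `REQUEST_LOAD_MODEL` / `REQUEST_RUN_MODEL` messages to a `HandlerThread`, which does the heavy work off the main thread) can be sketched outside Android. The class below is a hypothetical plain-Java analogue only: a `BlockingQueue` stands in for the `Handler`/`Looper` machinery, and a boolean flag stands in for `Predictor.init`; it is not the demo's actual code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Plain-Java analogue of MiniActivity's worker pattern: messages are queued
// to a single background thread, which "loads" and then "runs" the model.
public class WorkerSketch {
    public static final int REQUEST_LOAD_MODEL = 0;
    public static final int REQUEST_RUN_MODEL = 1;

    private final BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(8);
    private volatile boolean modelLoaded = false;
    private volatile String lastResult = null;

    // Starts the worker, sends the same two messages that onCreate and the
    // button click would send, and waits for the worker to finish.
    public String runOnce() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    int what = queue.take();
                    if (what == REQUEST_LOAD_MODEL) {
                        modelLoaded = true;              // stands in for onLoadModel()
                    } else if (what == REQUEST_RUN_MODEL) {
                        // stands in for onRunModel(): only succeeds after loading
                        lastResult = modelLoaded ? "ok" : "load model first";
                        return;
                    }
                }
            } catch (InterruptedException ignored) {
            }
        });
        worker.start();
        queue.add(REQUEST_LOAD_MODEL);
        queue.add(REQUEST_RUN_MODEL);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return lastResult;
    }
}
```

Because both messages travel through one FIFO queue, the load request is always handled before the run request, which is the ordering guarantee MiniActivity gets from posting both to the same `Handler`.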
@@ -29,16 +29,16 @@ public class OCRPredictorNative {
     public OCRPredictorNative(Config config) {
         this.config = config;
         loadLibrary();
-        nativePointer = init(config.detModelFilename, config.recModelFilename,
+        nativePointer = init(config.detModelFilename, config.recModelFilename, config.clsModelFilename,
                 config.cpuThreadNum, config.cpuPower);
         Log.i("OCRPredictorNative", "load success " + nativePointer);
 
     }
 
-    public void release(){
-        if (nativePointer != 0){
+    public void release() {
+        if (nativePointer != 0) {
             nativePointer = 0;
-            destory(nativePointer);
+            // destory(nativePointer);
         }
     }
 
@@ -55,10 +55,11 @@ public class OCRPredictorNative {
         public String cpuPower;
         public String detModelFilename;
         public String recModelFilename;
+        public String clsModelFilename;
 
     }
 
-    protected native long init(String detModelPath, String recModelPath, int threadNum, String cpuMode);
+    protected native long init(String detModelPath, String recModelPath, String clsModelPath, int threadNum, String cpuMode);
 
     protected native float[] forward(long pointer, float[] buf, float[] ddims, Bitmap originalImage);
 
@@ -38,7 +38,7 @@ public class Predictor {
     protected float scoreThreshold = 0.1f;
     protected Bitmap inputImage = null;
     protected Bitmap outputImage = null;
-    protected String outputResult = "";
+    protected volatile String outputResult = "";
     protected float preprocessTime = 0;
     protected float postprocessTime = 0;
 
@@ -46,6 +46,16 @@ public class Predictor {
     public Predictor() {
     }
 
+    public boolean init(Context appCtx, String modelPath, String labelPath) {
+        isLoaded = loadModel(appCtx, modelPath, cpuThreadNum, cpuPowerMode);
+        if (!isLoaded) {
+            return false;
+        }
+        isLoaded = loadLabel(appCtx, labelPath);
+        return isLoaded;
+    }
+
+
     public boolean init(Context appCtx, String modelPath, String labelPath, int cpuThreadNum, String cpuPowerMode,
                         String inputColorFormat,
                         long[] inputShape, float[] inputMean,
@@ -76,11 +86,7 @@ public class Predictor {
             Log.e(TAG, "Only BGR color format is supported.");
             return false;
         }
-        isLoaded = loadModel(appCtx, modelPath, cpuThreadNum, cpuPowerMode);
-        if (!isLoaded) {
-            return false;
-        }
-        isLoaded = loadLabel(appCtx, labelPath);
+        boolean isLoaded = init(appCtx, modelPath, labelPath);
         if (!isLoaded) {
             return false;
         }
@@ -115,7 +121,8 @@ public class Predictor {
         config.cpuThreadNum = cpuThreadNum;
         config.detModelFilename = realPath + File.separator + "ch_det_mv3_db_opt.nb";
         config.recModelFilename = realPath + File.separator + "ch_rec_mv3_crnn_opt.nb";
-        Log.e("Predictor", "model path" + config.detModelFilename + " ; " + config.recModelFilename);
+        config.clsModelFilename = realPath + File.separator + "cls_opt_arm.nb";
+        Log.e("Predictor", "model path" + config.detModelFilename + " ; " + config.recModelFilename + ";" + config.clsModelFilename);
         config.cpuPower = cpuPowerMode;
         paddlePredictor = new OCRPredictorNative(config);
 
@@ -127,12 +134,12 @@ public class Predictor {
     }
 
     public void releaseModel() {
-        if (paddlePredictor != null){
+        if (paddlePredictor != null) {
             paddlePredictor.release();
             paddlePredictor = null;
         }
         isLoaded = false;
-        cpuThreadNum = 4;
+        cpuThreadNum = 1;
         cpuPowerMode = "LITE_POWER_HIGH";
         modelPath = "";
         modelName = "";
@@ -222,7 +229,7 @@ public class Predictor {
         for (int i = 0; i < warmupIterNum; i++) {
             paddlePredictor.runImage(inputData, width, height, channels, inputImage);
         }
-        warmupIterNum = 0; // no need to warm up again afterwards
+        warmupIterNum = 0; // do not need warm
         // Run inference
         start = new Date();
         ArrayList<OcrResultModel> results = paddlePredictor.runImage(inputData, width, height, channels, inputImage);
@@ -287,9 +294,7 @@ public class Predictor {
         if (image == null) {
             return;
         }
-        // Scale image to the size of input tensor
-        Bitmap rgbaImage = image.copy(Bitmap.Config.ARGB_8888, true);
-        this.inputImage = rgbaImage;
+        this.inputImage = image.copy(Bitmap.Config.ARGB_8888, true);
     }
 
     private ArrayList<OcrResultModel> postprocess(ArrayList<OcrResultModel> results) {
@@ -310,7 +315,7 @@ public class Predictor {
 
     private void drawResults(ArrayList<OcrResultModel> results) {
         StringBuffer outputResultSb = new StringBuffer("");
-        for (int i=0;i<results.size();i++) {
+        for (int i = 0; i < results.size(); i++) {
             OcrResultModel result = results.get(i);
             StringBuilder sb = new StringBuilder("");
             sb.append(result.getLabel());
@@ -319,8 +324,8 @@ public class Predictor {
             for (Point p : result.getPoints()) {
                 sb.append("(").append(p.x).append(",").append(p.y).append(") ");
             }
-            Log.i(TAG, sb.toString());
-            outputResultSb.append(i+1).append(": ").append(result.getLabel()).append("\n");
+            Log.i(TAG, sb.toString()); // show LOG in Logcat panel
+            outputResultSb.append(i + 1).append(": ").append(result.getLabel()).append("\n");
         }
         outputResult = outputResultSb.toString();
         outputImage = inputImage;
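The `drawResults` change above is mostly formatting, but the text it assembles (one numbered line per recognized result, which ends up in `outputResult`) is easy to show in isolation. A minimal standalone sketch, with `ResultText.join` as a hypothetical helper name:

```java
import java.util.List;

// Sketch of drawResults' text assembly: number each recognized label
// as "1: label\n2: label\n...".
public class ResultText {
    public static String join(List<String> labels) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < labels.size(); i++) {
            sb.append(i + 1).append(": ").append(labels.get(i)).append("\n");
        }
        return sb.toString();
    }
}
```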
@@ -5,7 +5,8 @@ import android.os.Bundle;
 import android.preference.CheckBoxPreference;
 import android.preference.EditTextPreference;
 import android.preference.ListPreference;
-import android.support.v7.app.ActionBar;
+
+import androidx.appcompat.app.ActionBar;
 
 import java.util.ArrayList;
 import java.util.List;
@@ -2,6 +2,8 @@ package com.baidu.paddle.lite.demo.ocr;
 
 import android.content.Context;
 import android.graphics.Bitmap;
+import android.graphics.Matrix;
+import android.media.ExifInterface;
 import android.os.Environment;
 
 import java.io.*;
@@ -110,4 +112,48 @@ public class Utils {
         }
         return Bitmap.createScaledBitmap(bitmap, newWidth, newHeight, true);
     }
+
+    public static Bitmap rotateBitmap(Bitmap bitmap, int orientation) {
+        Matrix matrix = new Matrix();
+        switch (orientation) {
+            case ExifInterface.ORIENTATION_NORMAL:
+                return bitmap;
+            case ExifInterface.ORIENTATION_FLIP_HORIZONTAL:
+                matrix.setScale(-1, 1);
+                break;
+            case ExifInterface.ORIENTATION_ROTATE_180:
+                matrix.setRotate(180);
+                break;
+            case ExifInterface.ORIENTATION_FLIP_VERTICAL:
+                matrix.setRotate(180);
+                matrix.postScale(-1, 1);
+                break;
+            case ExifInterface.ORIENTATION_TRANSPOSE:
+                matrix.setRotate(90);
+                matrix.postScale(-1, 1);
+                break;
+            case ExifInterface.ORIENTATION_ROTATE_90:
+                matrix.setRotate(90);
+                break;
+            case ExifInterface.ORIENTATION_TRANSVERSE:
+                matrix.setRotate(-90);
+                matrix.postScale(-1, 1);
+                break;
+            case ExifInterface.ORIENTATION_ROTATE_270:
+                matrix.setRotate(-90);
+                break;
+            default:
+                return bitmap;
+        }
+        try {
+            Bitmap bmRotated = Bitmap.createBitmap(bitmap, 0, 0, bitmap.getWidth(), bitmap.getHeight(), matrix, true);
+            bitmap.recycle();
+            return bmRotated;
+        } catch (OutOfMemoryError e) {
+            e.printStackTrace();
+            return null;
+        }
+    }
 }
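The `rotateBitmap` helper added above maps each EXIF orientation tag onto `Matrix` operations. That mapping can be captured as a small pure function, decoupled from Android. In the sketch below, the `ORIENTATION_*` values mirror the `android.media.ExifInterface` constants (1 through 8), and `transformFor` is a hypothetical name, not part of the demo:

```java
// Map an EXIF orientation tag to the transform rotateBitmap applies:
// {rotation in degrees, 1 if a horizontal flip is also needed else 0}.
public class ExifTransform {
    // Values mirror android.media.ExifInterface's ORIENTATION_* constants.
    public static final int ORIENTATION_NORMAL = 1;
    public static final int ORIENTATION_FLIP_HORIZONTAL = 2;
    public static final int ORIENTATION_ROTATE_180 = 3;
    public static final int ORIENTATION_FLIP_VERTICAL = 4;
    public static final int ORIENTATION_TRANSPOSE = 5;
    public static final int ORIENTATION_ROTATE_90 = 6;
    public static final int ORIENTATION_TRANSVERSE = 7;
    public static final int ORIENTATION_ROTATE_270 = 8;

    public static int[] transformFor(int orientation) {
        switch (orientation) {
            case ORIENTATION_FLIP_HORIZONTAL: return new int[]{0, 1};
            case ORIENTATION_ROTATE_180:      return new int[]{180, 0};
            case ORIENTATION_FLIP_VERTICAL:   return new int[]{180, 1};
            case ORIENTATION_TRANSPOSE:       return new int[]{90, 1};
            case ORIENTATION_ROTATE_90:       return new int[]{90, 0};
            case ORIENTATION_TRANSVERSE:      return new int[]{-90, 1};
            case ORIENTATION_ROTATE_270:      return new int[]{-90, 0};
            default:                          return new int[]{0, 0}; // NORMAL or undefined
        }
    }
}
```

Note that the transposed orientations (5 and 7) combine a rotation with a mirror, which is why `rotateBitmap` chains `setRotate` and `postScale` for those cases.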
@@ -1,5 +1,5 @@
 <?xml version="1.0" encoding="utf-8"?>
-<android.support.constraint.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
+<androidx.constraintlayout.widget.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
     xmlns:app="http://schemas.android.com/apk/res-auto"
     xmlns:tools="http://schemas.android.com/tools"
     android:layout_width="match_parent"
@@ -96,4 +96,4 @@
 
     </RelativeLayout>
 
-</android.support.constraint.ConstraintLayout>
+</androidx.constraintlayout.widget.ConstraintLayout>
@@ -0,0 +1,46 @@
+<?xml version="1.0" encoding="utf-8"?>
+<!-- for MiniActivity Use Only -->
+<androidx.constraintlayout.widget.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
+    xmlns:app="http://schemas.android.com/apk/res-auto"
+    xmlns:tools="http://schemas.android.com/tools"
+    android:layout_width="match_parent"
+    android:layout_height="match_parent"
+    app:layout_constraintLeft_toLeftOf="parent"
+    app:layout_constraintLeft_toRightOf="parent"
+    tools:context=".MainActivity">
+
+    <TextView
+        android:id="@+id/sample_text"
+        android:layout_width="0dp"
+        android:layout_height="wrap_content"
+        android:text="Hello World!"
+        app:layout_constraintLeft_toLeftOf="parent"
+        app:layout_constraintRight_toRightOf="parent"
+        app:layout_constraintTop_toBottomOf="@id/imageView"
+        android:scrollbars="vertical"
+        />
+
+    <ImageView
+        android:id="@+id/imageView"
+        android:layout_width="wrap_content"
+        android:layout_height="wrap_content"
+        android:paddingTop="20dp"
+        android:paddingBottom="20dp"
+        app:layout_constraintBottom_toTopOf="@id/imageView"
+        app:layout_constraintLeft_toLeftOf="parent"
+        app:layout_constraintRight_toRightOf="parent"
+        app:layout_constraintTop_toTopOf="parent"
+        tools:srcCompat="@tools:sample/avatars" />
+
+    <Button
+        android:id="@+id/button"
+        android:layout_width="wrap_content"
+        android:layout_height="wrap_content"
+        android:layout_marginBottom="4dp"
+        android:text="Button"
+        app:layout_constraintBottom_toBottomOf="parent"
+        app:layout_constraintLeft_toLeftOf="parent"
+        app:layout_constraintRight_toRightOf="parent"
+        tools:layout_editor_absoluteX="161dp" />
+
+</androidx.constraintlayout.widget.ConstraintLayout>
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="utf-8"?>
+<paths xmlns:android="http://schemas.android.com/apk/res/android">
+    <external-files-path name="my_images" path="Pictures" />
+</paths>
@@ -1,4 +1,4 @@
-#Thu Aug 22 15:05:37 CST 2019
+#Wed Jul 22 23:48:44 CST 2020
 distributionBase=GRADLE_USER_HOME
 distributionPath=wrapper/dists
 zipStoreBase=GRADLE_USER_HOME
@@ -1,8 +1,17 @@
 project(ocr_system CXX C)
 
 option(WITH_MKL        "Compile demo with MKL/OpenBlas support, default use MKL."       ON)
 option(WITH_GPU        "Compile demo with GPU/CPU, default use CPU."                    OFF)
 option(WITH_STATIC_LIB "Compile demo with static/shared library, default use static."   ON)
-option(USE_TENSORRT "Compile demo with TensorRT." OFF)
+option(WITH_TENSORRT "Compile demo with TensorRT." OFF)
+
+SET(PADDLE_LIB "" CACHE PATH "Location of libraries")
+SET(OPENCV_DIR "" CACHE PATH "Location of libraries")
+SET(CUDA_LIB "" CACHE PATH "Location of libraries")
+SET(CUDNN_LIB "" CACHE PATH "Location of libraries")
+SET(TENSORRT_DIR "" CACHE PATH "Compile demo with TensorRT")
+
+set(DEMO_NAME "ocr_system")
+
 
 macro(safe_set_static_flag)
@@ -15,24 +24,60 @@ macro(safe_set_static_flag)
   endforeach(flag_var)
 endmacro()
 
-set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -g -fpermissive")
-set(CMAKE_STATIC_LIBRARY_PREFIX "")
-message("flags" ${CMAKE_CXX_FLAGS})
-set(CMAKE_CXX_FLAGS_RELEASE "-O3")
+if (WITH_MKL)
+    ADD_DEFINITIONS(-DUSE_MKL)
+endif()
 
 if(NOT DEFINED PADDLE_LIB)
   message(FATAL_ERROR "please set PADDLE_LIB with -DPADDLE_LIB=/path/paddle/lib")
 endif()
-if(NOT DEFINED DEMO_NAME)
-  message(FATAL_ERROR "please set DEMO_NAME with -DDEMO_NAME=demo_name")
+
+if(NOT DEFINED OPENCV_DIR)
+    message(FATAL_ERROR "please set OPENCV_DIR with -DOPENCV_DIR=/path/opencv")
 endif()
 
-set(OPENCV_DIR ${OPENCV_DIR})
-find_package(OpenCV REQUIRED PATHS ${OPENCV_DIR}/share/OpenCV NO_DEFAULT_PATH)
+if (WIN32)
+    include_directories("${PADDLE_LIB}/paddle/fluid/inference")
+    include_directories("${PADDLE_LIB}/paddle/include")
+    link_directories("${PADDLE_LIB}/paddle/fluid/inference")
+    find_package(OpenCV REQUIRED PATHS ${OPENCV_DIR}/build/ NO_DEFAULT_PATH)
+
+else ()
+    find_package(OpenCV REQUIRED PATHS ${OPENCV_DIR}/share/OpenCV NO_DEFAULT_PATH)
+    include_directories("${PADDLE_LIB}/paddle/include")
+    link_directories("${PADDLE_LIB}/paddle/lib")
+endif ()
 include_directories(${OpenCV_INCLUDE_DIRS})
 
-include_directories("${PADDLE_LIB}/paddle/include")
+if (WIN32)
+    add_definitions("/DGOOGLE_GLOG_DLL_DECL=")
+    set(CMAKE_C_FLAGS_DEBUG   "${CMAKE_C_FLAGS_DEBUG} /bigobj /MTd")
+    set(CMAKE_C_FLAGS_RELEASE  "${CMAKE_C_FLAGS_RELEASE} /bigobj /MT")
+    set(CMAKE_CXX_FLAGS_DEBUG  "${CMAKE_CXX_FLAGS_DEBUG} /bigobj /MTd")
+    set(CMAKE_CXX_FLAGS_RELEASE   "${CMAKE_CXX_FLAGS_RELEASE} /bigobj /MT")
+    if (WITH_STATIC_LIB)
+        safe_set_static_flag()
+        add_definitions(-DSTATIC_LIB)
+    endif()
+else()
+    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -o3 -std=c++11")
+    set(CMAKE_STATIC_LIBRARY_PREFIX "")
+endif()
+message("flags" ${CMAKE_CXX_FLAGS})
+
+
+if (WITH_GPU)
+    if (NOT DEFINED CUDA_LIB OR ${CUDA_LIB} STREQUAL "")
+        message(FATAL_ERROR "please set CUDA_LIB with -DCUDA_LIB=/path/cuda-8.0/lib64")
+    endif()
+    if (NOT WIN32)
+        if (NOT DEFINED CUDNN_LIB)
+            message(FATAL_ERROR "please set CUDNN_LIB with -DCUDNN_LIB=/path/cudnn_v7.4/cuda/lib64")
+        endif()
+    endif(NOT WIN32)
+endif()
 
 include_directories("${PADDLE_LIB}/third_party/install/protobuf/include")
 include_directories("${PADDLE_LIB}/third_party/install/glog/include")
 include_directories("${PADDLE_LIB}/third_party/install/gflags/include")
@@ -43,10 +88,12 @@ include_directories("${PADDLE_LIB}/third_party/eigen3")
 
 include_directories("${CMAKE_SOURCE_DIR}/")
 
-if (USE_TENSORRT AND WITH_GPU)
-  include_directories("${TENSORRT_ROOT}/include")
-  link_directories("${TENSORRT_ROOT}/lib")
-endif()
+if (NOT WIN32)
+  if (WITH_TENSORRT AND WITH_GPU)
+    include_directories("${TENSORRT_DIR}/include")
+    link_directories("${TENSORRT_DIR}/lib")
+  endif()
+endif(NOT WIN32)
 
 link_directories("${PADDLE_LIB}/third_party/install/zlib/lib")
 
@@ -57,17 +104,24 @@ link_directories("${PADDLE_LIB}/third_party/install/xxhash/lib")
 link_directories("${PADDLE_LIB}/paddle/lib")
 
-AUX_SOURCE_DIRECTORY(./src SRCS)
-add_executable(${DEMO_NAME} ${SRCS})
 
 if(WITH_MKL)
   include_directories("${PADDLE_LIB}/third_party/install/mklml/include")
-  set(MATH_LIB ${PADDLE_LIB}/third_party/install/mklml/lib/libmklml_intel${CMAKE_SHARED_LIBRARY_SUFFIX}
-               ${PADDLE_LIB}/third_party/install/mklml/lib/libiomp5${CMAKE_SHARED_LIBRARY_SUFFIX})
+  if (WIN32)
+    set(MATH_LIB ${PADDLE_LIB}/third_party/install/mklml/lib/mklml.lib
+                 ${PADDLE_LIB}/third_party/install/mklml/lib/libiomp5md.lib)
+  else ()
+    set(MATH_LIB ${PADDLE_LIB}/third_party/install/mklml/lib/libmklml_intel${CMAKE_SHARED_LIBRARY_SUFFIX}
+                 ${PADDLE_LIB}/third_party/install/mklml/lib/libiomp5${CMAKE_SHARED_LIBRARY_SUFFIX})
+    execute_process(COMMAND cp -r ${PADDLE_LIB}/third_party/install/mklml/lib/libmklml_intel${CMAKE_SHARED_LIBRARY_SUFFIX} /usr/lib)
+  endif ()
  set(MKLDNN_PATH "${PADDLE_LIB}/third_party/install/mkldnn")
  if(EXISTS ${MKLDNN_PATH})
    include_directories("${MKLDNN_PATH}/include")
-    set(MKLDNN_LIB ${MKLDNN_PATH}/lib/libmkldnn.so.0)
+    if (WIN32)
+      set(MKLDNN_LIB ${MKLDNN_PATH}/lib/mkldnn.lib)
+    else ()
+      set(MKLDNN_LIB ${MKLDNN_PATH}/lib/libmkldnn.so.0)
+    endif ()
  endif()
 else()
  set(MATH_LIB ${PADDLE_LIB}/third_party/install/openblas/lib/libopenblas${CMAKE_STATIC_LIBRARY_SUFFIX})
@@ -82,24 +136,66 @@ else()
         ${PADDLE_LIB}/paddle/lib/libpaddle_fluid${CMAKE_SHARED_LIBRARY_SUFFIX})
 endif()
 
-set(EXTERNAL_LIB "-lrt -ldl -lpthread -lm")
+if (NOT WIN32)
+  set(DEPS ${DEPS}
+      ${MATH_LIB} ${MKLDNN_LIB}
+      glog gflags protobuf z xxhash
+  )
+  if(EXISTS "${PADDLE_LIB}/third_party/install/snappystream/lib")
+    set(DEPS ${DEPS} snappystream)
+  endif()
+  if (EXISTS "${PADDLE_LIB}/third_party/install/snappy/lib")
+    set(DEPS ${DEPS} snappy)
+  endif()
+else()
+  set(DEPS ${DEPS}
+      ${MATH_LIB} ${MKLDNN_LIB}
+      glog gflags_static libprotobuf xxhash)
+  set(DEPS ${DEPS} libcmt shlwapi)
+  if (EXISTS "${PADDLE_LIB}/third_party/install/snappy/lib")
+    set(DEPS ${DEPS} snappy)
+  endif()
+  if(EXISTS "${PADDLE_LIB}/third_party/install/snappystream/lib")
+    set(DEPS ${DEPS} snappystream)
+  endif()
+endif(NOT WIN32)
 
-set(DEPS ${DEPS}
-    ${MATH_LIB} ${MKLDNN_LIB}
-    glog gflags protobuf z xxhash
-    ${EXTERNAL_LIB} ${OpenCV_LIBS})
 
 if(WITH_GPU)
-  if (USE_TENSORRT)
-    set(DEPS ${DEPS}
-        ${TENSORRT_ROOT}/lib/libnvinfer${CMAKE_SHARED_LIBRARY_SUFFIX})
-    set(DEPS ${DEPS}
-        ${TENSORRT_ROOT}/lib/libnvinfer_plugin${CMAKE_SHARED_LIBRARY_SUFFIX})
+  if(NOT WIN32)
+    if (WITH_TENSORRT)
+      set(DEPS ${DEPS} ${TENSORRT_DIR}/lib/libnvinfer${CMAKE_SHARED_LIBRARY_SUFFIX})
+      set(DEPS ${DEPS} ${TENSORRT_DIR}/lib/libnvinfer_plugin${CMAKE_SHARED_LIBRARY_SUFFIX})
+    endif()
+    set(DEPS ${DEPS} ${CUDA_LIB}/libcudart${CMAKE_SHARED_LIBRARY_SUFFIX})
+    set(DEPS ${DEPS} ${CUDNN_LIB}/libcudnn${CMAKE_SHARED_LIBRARY_SUFFIX})
+  else()
+    set(DEPS ${DEPS} ${CUDA_LIB}/cudart${CMAKE_STATIC_LIBRARY_SUFFIX} )
+    set(DEPS ${DEPS} ${CUDA_LIB}/cublas${CMAKE_STATIC_LIBRARY_SUFFIX} )
+    set(DEPS ${DEPS} ${CUDNN_LIB}/cudnn${CMAKE_STATIC_LIBRARY_SUFFIX})
   endif()
-  set(DEPS ${DEPS} ${CUDA_LIB}/libcudart${CMAKE_SHARED_LIBRARY_SUFFIX})
-  set(DEPS ${DEPS} ${CUDA_LIB}/libcudart${CMAKE_SHARED_LIBRARY_SUFFIX} )
-  set(DEPS ${DEPS} ${CUDA_LIB}/libcublas${CMAKE_SHARED_LIBRARY_SUFFIX} )
-  set(DEPS ${DEPS} ${CUDNN_LIB}/libcudnn${CMAKE_SHARED_LIBRARY_SUFFIX} )
 endif()
 
+
+if (NOT WIN32)
+  set(EXTERNAL_LIB "-ldl -lrt -lgomp -lz -lm -lpthread")
+  set(DEPS ${DEPS} ${EXTERNAL_LIB})
+endif()
+
+set(DEPS ${DEPS} ${OpenCV_LIBS})
+
+AUX_SOURCE_DIRECTORY(./src SRCS)
+add_executable(${DEMO_NAME} ${SRCS})
+
 target_link_libraries(${DEMO_NAME} ${DEPS})
+
+if (WIN32 AND WITH_MKL)
+  add_custom_command(TARGET ${DEMO_NAME} POST_BUILD
+    COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_LIB}/third_party/install/mklml/lib/mklml.dll ./mklml.dll
+    COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_LIB}/third_party/install/mklml/lib/libiomp5md.dll ./libiomp5md.dll
+    COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_LIB}/third_party/install/mkldnn/lib/mkldnn.dll ./mkldnn.dll
+    COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_LIB}/third_party/install/mklml/lib/mklml.dll ./release/mklml.dll
+    COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_LIB}/third_party/install/mklml/lib/libiomp5md.dll ./release/libiomp5md.dll
+    COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_LIB}/third_party/install/mkldnn/lib/mkldnn.dll ./release/mkldnn.dll
+  )
+endif()
# Visual Studio 2019 Community CMake Build Guide

PaddleOCR has been tested on Windows with `Visual Studio 2019 Community`. Microsoft has supported managing `CMake` cross-platform build projects directly since `Visual Studio 2017`, but stable and complete support only arrived in `2019`. If you want to use CMake to manage your project build, we therefore recommend building with `Visual Studio 2019`.

## Prerequisites

* Visual Studio 2019
* CUDA 9.0 / CUDA 10.0, cudnn 7+ (required only when using the GPU version of the inference library)
* CMake 3.0+

Please make sure the software above is installed; we use the Community edition of `VS2019`.

**All examples below use `D:\projects` as the working directory.**

### Step1: Download the PaddlePaddle C++ inference library fluid_inference

The PaddlePaddle C++ inference library provides different precompiled versions for different `CPU` and `CUDA` versions. Please download the one matching your setup: [C++ inference library download list](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/windows_cpp_inference.html)

After decompression, the `D:\projects\fluid_inference` directory contains:

```
fluid_inference
├── paddle        # paddle core libraries and headers
├── third_party   # third-party dependency libraries and headers
└── version.txt   # version and build information
```

### Step2: Install and configure OpenCV

1. Download OpenCV 3.4.6 for Windows from the OpenCV official site, [download link](https://sourceforge.net/projects/opencvlibrary/files/3.4.6/opencv-3.4.6-vc14_vc15.exe/download)
2. Run the downloaded executable and extract OpenCV to a directory of your choice, e.g. `D:\projects\opencv`
3. Configure the environment variables as follows
   - My Computer -> Properties -> Advanced system settings -> Environment Variables
   - Find `Path` in the system variables (create it if it does not exist) and double-click to edit
   - Click New, add the OpenCV binary path, and save, e.g. `D:\projects\opencv\build\x64\vc14\bin`

### Step3: Build with CMake directly in Visual Studio 2019

1. Open Visual Studio 2019 Community and click `Continue without code`

![step1](https://paddleocr.bj.bcebos.com/deploy/cpp_infer/cpp_files/vs2019_step1.png)

2. Click `File` -> `Open` -> `CMake`

![step2.1](https://paddleocr.bj.bcebos.com/deploy/cpp_infer/cpp_files/vs2019_step2.png)

Select the path of the project code and open `CMakeList.txt`:

![step2.2](https://paddleocr.bj.bcebos.com/deploy/cpp_infer/cpp_files/vs2019_step2.png)

3. Click `Project` -> `CMake settings for cpp_inference_demo`

![step3](https://paddleocr.bj.bcebos.com/deploy/cpp_infer/cpp_files/vs2019_step3.png)

4. Click `Browse` and set the build options for the paths of `CUDA`, `CUDNN_LIB`, `OpenCV`, and the Paddle inference library

The meanings of the build parameters are as follows (parameters marked with `*` only need to be set when using the **GPU version** of the inference library; keep the CUDA library versions aligned, **use CUDA 9.0 or 10.0, not 9.2, 10.1, or other CUDA versions**):

| Parameter | Meaning |
| ---- | ---- |
| *CUDA_LIB | CUDA library path |
| *CUDNN_LIB | CUDNN library path |
| OPENCV_DIR | OpenCV installation path |
| PADDLE_LIB | Paddle inference library path |

**Note:**

1. If you use the `CPU` version of the inference library, uncheck `WITH_GPU`
2. If you use the `openblas` version, uncheck `WITH_MKL`

![step4](https://paddleocr.bj.bcebos.com/deploy/cpp_infer/cpp_files/vs2019_step5.png)

**After the settings are complete**, click `Save and generate CMake cache to load variables` shown above.

5. Click `Build` -> `Build All`

![step6](https://paddleocr.bj.bcebos.com/deploy/cpp_infer/cpp_files/vs2019_step6.png)

### Step4: Inference and visualization

The executable built by `Visual Studio 2019` above is located in the `out\build\x64-Release` directory. Open `cmd` and switch to that directory:

```
cd D:\projects\PaddleOCR\deploy\cpp_infer\out\build\x64-Release
```

The executable `ocr_system.exe` is the sample inference program. Its basic usage is as follows

```shell
# predict the image `D:\projects\PaddleOCR\doc\imgs\10.jpg`
.\ocr_system.exe D:\projects\PaddleOCR\deploy\cpp_infer\tools\config.txt D:\projects\PaddleOCR\doc\imgs\10.jpg
```

The first argument is the path of the configuration file, and the second is the path of the image to predict.

### Note

* When running the exe in a Windows terminal, the output may appear garbled. In that case, enter `CHCP 65001` in the terminal to switch its encoding from GBK (the default) to UTF-8. For a more detailed explanation, see this blog post: [https://blog.csdn.net/qq_35038153/article/details/78430359](https://blog.csdn.net/qq_35038153/article/details/78430359).
```cpp
    this->use_mkldnn = bool(stoi(config_map_["use_mkldnn"]));

    this->use_zero_copy_run = bool(stoi(config_map_["use_zero_copy_run"]));

    this->max_side_len = stoi(config_map_["max_side_len"]);

    this->det_db_thresh = stod(config_map_["det_db_thresh"]);

    this->det_db_box_thresh = stod(config_map_["det_db_box_thresh"]);

    this->det_db_unclip_ratio = stod(config_map_["det_db_unclip_ratio"]);

    this->det_model_dir.assign(config_map_["det_model_dir"]);

    // ...

    this->char_list_file.assign(config_map_["char_list_file"]);

    this->use_angle_cls = bool(stoi(config_map_["use_angle_cls"]));

    this->cls_model_dir.assign(config_map_["cls_model_dir"]);

    this->cls_thresh = stod(config_map_["cls_thresh"]);

    this->visualize = bool(stoi(config_map_["visualize"]));
  }

  // ...

  bool use_mkldnn = false;
  bool use_zero_copy_run = false;
  int max_side_len = 960;
  double det_db_thresh = 0.3;

  // ...

  std::string rec_model_dir;
  bool use_angle_cls;
  std::string char_list_file;
  std::string cls_model_dir;
  double cls_thresh;
  bool visualize = true;

  void PrintConfigInfo();
```
```cpp
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include "paddle_api.h"
#include "paddle_inference_api.h"
#include <chrono>
#include <iomanip>
#include <iostream>
#include <ostream>
#include <vector>

#include <cstring>
#include <fstream>
#include <numeric>

#include <include/preprocess_op.h>
#include <include/utility.h>

namespace PaddleOCR {

class Classifier {
public:
  explicit Classifier(const std::string &model_dir, const bool &use_gpu,
                      const int &gpu_id, const int &gpu_mem,
                      const int &cpu_math_library_num_threads,
                      const bool &use_mkldnn, const bool &use_zero_copy_run,
                      const double &cls_thresh) {
    this->use_gpu_ = use_gpu;
    this->gpu_id_ = gpu_id;
    this->gpu_mem_ = gpu_mem;
    this->cpu_math_library_num_threads_ = cpu_math_library_num_threads;
    this->use_mkldnn_ = use_mkldnn;
    this->use_zero_copy_run_ = use_zero_copy_run;

    this->cls_thresh = cls_thresh;

    LoadModel(model_dir);
  }

  // Load Paddle inference model
  void LoadModel(const std::string &model_dir);

  cv::Mat Run(cv::Mat &img);

private:
  std::shared_ptr<PaddlePredictor> predictor_;

  bool use_gpu_ = false;
  int gpu_id_ = 0;
  int gpu_mem_ = 4000;
  int cpu_math_library_num_threads_ = 4;
  bool use_mkldnn_ = false;
  bool use_zero_copy_run_ = false;
  double cls_thresh = 0.5;

  std::vector<float> mean_ = {0.5f, 0.5f, 0.5f};
  std::vector<float> scale_ = {1 / 0.5f, 1 / 0.5f, 1 / 0.5f};
  bool is_scale_ = true;

  // pre-process
  ClsResizeImg resize_op_;
  Normalize normalize_op_;
  Permute permute_op_;

}; // class Classifier

} // namespace PaddleOCR
```
```cpp
  explicit DBDetector(const std::string &model_dir, const bool &use_gpu,
                      const int &gpu_id, const int &gpu_mem,
                      const int &cpu_math_library_num_threads,
                      const bool &use_mkldnn, const bool &use_zero_copy_run,
                      const int &max_side_len, const double &det_db_thresh,
                      const double &det_db_box_thresh,
                      const double &det_db_unclip_ratio,
                      const bool &visualize) {
    // ...
    this->gpu_mem_ = gpu_mem;
    this->cpu_math_library_num_threads_ = cpu_math_library_num_threads;
    this->use_mkldnn_ = use_mkldnn;
    this->use_zero_copy_run_ = use_zero_copy_run;

    this->max_side_len_ = max_side_len;

  // ...

private:
  // ...
  int gpu_mem_ = 4000;
  int cpu_math_library_num_threads_ = 4;
  bool use_mkldnn_ = false;
  bool use_zero_copy_run_ = false;

  int max_side_len_ = 960;
```
```cpp
#include <fstream>
#include <numeric>

#include <include/ocr_cls.h>
#include <include/postprocess_op.h>
#include <include/preprocess_op.h>
#include <include/utility.h>

// ...

  explicit CRNNRecognizer(const std::string &model_dir, const bool &use_gpu,
                          const int &gpu_id, const int &gpu_mem,
                          const int &cpu_math_library_num_threads,
                          const bool &use_mkldnn, const bool &use_zero_copy_run,
                          const string &label_path) {
    this->use_gpu_ = use_gpu;
    this->gpu_id_ = gpu_id;
    this->gpu_mem_ = gpu_mem;
    this->cpu_math_library_num_threads_ = cpu_math_library_num_threads;
    this->use_mkldnn_ = use_mkldnn;
    this->use_zero_copy_run_ = use_zero_copy_run;

    this->label_list_ = Utility::ReadDict(label_path);
    this->label_list_.push_back(" ");

    LoadModel(model_dir);
  }

  // Load Paddle inference model
  void LoadModel(const std::string &model_dir);

  void Run(std::vector<std::vector<std::vector<int>>> boxes, cv::Mat &img,
           Classifier *cls);

private:
  std::shared_ptr<PaddlePredictor> predictor_;
  // ...
  int gpu_mem_ = 4000;
  int cpu_math_library_num_threads_ = 4;
  bool use_mkldnn_ = false;
  bool use_zero_copy_run_ = false;

  std::vector<std::string> label_list_;

  // ...

}; // class CrnnRecognizer

} // namespace PaddleOCR
```
```cpp
                   const std::vector<int> &rec_image_shape = {3, 32, 320});
};

class ClsResizeImg {
public:
  virtual void Run(const cv::Mat &img, cv::Mat &resize_img,
                   const std::vector<int> &rec_image_shape = {3, 32, 320});
};

} // namespace PaddleOCR
```
### Preparation
- Linux environment; docker is recommended.
- Windows environment; building with `Visual Studio 2019 Community` is currently supported.

* This document mainly introduces the PaddleOCR C++ inference workflow on Linux. To build and run C++ inference on Windows with the inference library, refer to the [Windows build tutorial](./docs/windows_vs2019_build.md)

### 1.1 Compile the opencv library

### Run the demo
* Run the following command to perform OCR detection and recognition on an image.

```shell
sh tools/run.sh
```

* To use the direction classifier, set the `use_angle_cls` parameter in `tools/config.txt` to 1, which enables direction classification during inference.

The detection results will finally be printed to the screen as follows.

<div align="center">
# Server-side C++ inference

In this tutorial, we will introduce the detailed steps of deploying PaddleOCR ultra-lightweight Chinese detection and recognition models on the server side.

## 1. Prepare the environment

### Environment

- Linux, docker is recommended.

### 1.1 Compile opencv

* First of all, you need to download the compiled source code package for the Linux environment from the opencv official website. Taking opencv 3.4.7 as an example, the download command is as follows.

```
wget https://github.com/opencv/opencv/archive/3.4.7.tar.gz
tar -xf 3.4.7.tar.gz
```

Finally, you will see the folder `opencv-3.4.7/` in the current directory.

* To compile opencv, set the opencv source path (`root_path`) and installation path (`install_path`) yourself. Enter the opencv source code path and compile it in the following way.

```shell
root_path=your_opencv_root_path
install_path=${root_path}/opencv3

rm -rf build
mkdir build
cd build

cmake .. \
    -DCMAKE_INSTALL_PREFIX=${install_path} \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=OFF \
    -DWITH_IPP=OFF \
    -DBUILD_IPP_IW=OFF \
    -DWITH_LAPACK=OFF \
    -DWITH_EIGEN=OFF \
    -DCMAKE_INSTALL_LIBDIR=lib64 \
    -DWITH_ZLIB=ON \
    -DBUILD_ZLIB=ON \
    -DWITH_JPEG=ON \
    -DBUILD_JPEG=ON \
    -DWITH_PNG=ON \
    -DBUILD_PNG=ON \
    -DWITH_TIFF=ON \
    -DBUILD_TIFF=ON

make -j
make install
```

Among them, `root_path` is the downloaded opencv source code path, and `install_path` is the installation path of opencv. After `make install` completes, the opencv header and library files are generated in this folder for the later OCR source code compilation.

The final file structure under the opencv installation path is as follows.

```
opencv3/
|-- bin
|-- include
|-- lib
|-- lib64
|-- share
```

### 1.2 Compile or download the Paddle inference library

* There are 2 ways to obtain the Paddle inference library, described in detail below.

#### 1.2.1 Compile from the source code
* If you want to get the latest Paddle inference library features, you can download the latest code from the Paddle GitHub repository and compile the inference library from the source code.
* You can refer to [Paddle inference library](https://www.paddlepaddle.org.cn/documentation/docs/en/advanced_guide/inference_deployment/inference/build_and_install_lib_en.html) to get the Paddle source code from GitHub and then compile it to generate the latest inference library. The method of using git to access the code is as follows.

```shell
git clone https://github.com/PaddlePaddle/Paddle.git
```

* After entering the Paddle directory, the compilation method is as follows.

```shell
rm -rf build
mkdir build
cd build

cmake .. \
    -DWITH_CONTRIB=OFF \
    -DWITH_MKL=ON \
    -DWITH_MKLDNN=ON \
    -DWITH_TESTING=OFF \
    -DCMAKE_BUILD_TYPE=Release \
    -DWITH_INFERENCE_API_TEST=OFF \
    -DON_INFER=ON \
    -DWITH_PYTHON=ON
make -j
make inference_lib_dist
```

For more compilation parameter options, please refer to the official website of the Paddle C++ inference library: [https://www.paddlepaddle.org.cn/documentation/docs/en/advanced_guide/inference_deployment/inference/build_and_install_lib_en.html](https://www.paddlepaddle.org.cn/documentation/docs/en/advanced_guide/inference_deployment/inference/build_and_install_lib_en.html).

* After the compilation process, you can see the following files in the folder `build/fluid_inference_install_dir/`.

```
build/fluid_inference_install_dir/
|-- CMakeCache.txt
|-- paddle
|-- third_party
|-- version.txt
```

Among them, `paddle` is the Paddle library required for C++ inference later, and `version.txt` contains the version information of the current inference library.

#### 1.2.2 Direct download and installation

* Linux inference libraries (based on GCC 4.8.2) for different CUDA versions are provided on the [Paddle inference library official website](https://www.paddlepaddle.org.cn/documentation/docs/en/advanced_guide/inference_deployment/inference/build_and_install_lib_en.html). You can view and select the appropriate version of the inference library on the official website.

* After downloading, use the following method to uncompress it.

```
tar -xf fluid_inference.tgz
```

Finally, you can see the resulting files in the folder `fluid_inference/`.

## 2. Compile and run the demo

### 2.1 Export the inference model

* You can refer to [Model inference](../../doc/doc_ch/inference.md) to export the inference model. After the model is exported, assuming it is placed in the `inference` directory, the directory structure is as follows.

```
inference/
|-- det_db
|   |--model
|   |--params
|-- rec_rcnn
|   |--model
|   |--params
```

### 2.2 Compile the PaddleOCR C++ inference demo

* The compilation commands are as follows. The paths of the Paddle C++ inference library, opencv, and other dependencies need to be replaced with the actual paths on your own machine.

```shell
sh tools/build.sh
```

Specifically, the content of `tools/build.sh` is as follows.

```shell
OPENCV_DIR=your_opencv_dir
LIB_DIR=your_paddle_inference_dir
CUDA_LIB_DIR=your_cuda_lib_dir
CUDNN_LIB_DIR=your_cudnn_lib_dir

BUILD_DIR=build
rm -rf ${BUILD_DIR}
mkdir ${BUILD_DIR}
cd ${BUILD_DIR}
cmake .. \
    -DPADDLE_LIB=${LIB_DIR} \
    -DWITH_MKL=ON \
    -DDEMO_NAME=ocr_system \
    -DWITH_GPU=OFF \
    -DWITH_STATIC_LIB=OFF \
    -DUSE_TENSORRT=OFF \
    -DOPENCV_DIR=${OPENCV_DIR} \
    -DCUDNN_LIB=${CUDNN_LIB_DIR} \
    -DCUDA_LIB=${CUDA_LIB_DIR}

make -j
```

`OPENCV_DIR` is the opencv installation path; `LIB_DIR` is the downloaded (`fluid_inference` folder) or compiled Paddle inference library path (`build/fluid_inference_install_dir` folder); `CUDA_LIB_DIR` is the cuda library file path, in docker it is `/usr/local/cuda/lib64`; `CUDNN_LIB_DIR` is the cudnn library file path, in docker it is `/usr/lib/x86_64-linux-gnu/`.

* After the compilation is completed, an executable file named `ocr_system` will be generated in the `build` folder.

### Run the demo
* Execute the following command to complete the OCR recognition and detection of an image.

```shell
sh tools/run.sh
```

* If you want the orientation classifier to correct the detected boxes, set `use_angle_cls` in the file `tools/config.txt` to 1 to enable the function.

The detection results will be shown on the screen, as follows.

<div align="center">
<img src="../imgs/cpp_infer_pred_12.png" width="600">
</div>

### 2.3 Note

* `MKLDNN` is disabled by default for C++ inference (`use_mkldnn` in `tools/config.txt` is set to 0). If you need MKLDNN to accelerate inference, set `use_mkldnn` to 1 and compile the inference library with the latest Paddle source code. When using MKLDNN for CPU prediction, predicting multiple images in a row leaks memory (the problem does not occur when MKLDNN is disabled). The problem is currently being fixed; as a temporary workaround, re-initialize the recognition class (`CRNNRecognizer`) and the detection class (`DBDetector`) every 30 images or so.
```cpp
  std::map<std::string, std::string> dict;
  for (int i = 0; i < config.size(); i++) {
    // pass for empty line or comment
    if (config[i].size() <= 1 || config[i][0] == '#') {
      continue;
    }
    std::vector<std::string> res = split(config[i], " ");
```
```cpp
  cv::Mat srcimg = cv::imread(img_path, cv::IMREAD_COLOR);

  DBDetector det(
      config.det_model_dir, config.use_gpu, config.gpu_id, config.gpu_mem,
      config.cpu_math_library_num_threads, config.use_mkldnn,
      config.use_zero_copy_run, config.max_side_len, config.det_db_thresh,
      config.det_db_box_thresh, config.det_db_unclip_ratio, config.visualize);

  Classifier *cls = nullptr;
  if (config.use_angle_cls == true) {
    cls = new Classifier(config.cls_model_dir, config.use_gpu, config.gpu_id,
                         config.gpu_mem, config.cpu_math_library_num_threads,
                         config.use_mkldnn, config.use_zero_copy_run,
                         config.cls_thresh);
  }

  CRNNRecognizer rec(config.rec_model_dir, config.use_gpu, config.gpu_id,
                     config.gpu_mem, config.cpu_math_library_num_threads,
                     config.use_mkldnn, config.use_zero_copy_run,
                     config.char_list_file);

  auto start = std::chrono::system_clock::now();
  std::vector<std::vector<std::vector<int>>> boxes;
  det.Run(srcimg, boxes);

  rec.Run(boxes, srcimg, cls);

  auto end = std::chrono::system_clock::now();
  auto duration =
```
```cpp
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include <include/ocr_cls.h>

namespace PaddleOCR {

cv::Mat Classifier::Run(cv::Mat &img) {
  cv::Mat src_img;
  img.copyTo(src_img);
  cv::Mat resize_img;

  std::vector<int> rec_image_shape = {3, 32, 100};
  int index = 0;
  float wh_ratio = float(img.cols) / float(img.rows);

  this->resize_op_.Run(img, resize_img, rec_image_shape);

  this->normalize_op_.Run(&resize_img, this->mean_, this->scale_,
                          this->is_scale_);

  std::vector<float> input(1 * 3 * resize_img.rows * resize_img.cols, 0.0f);

  this->permute_op_.Run(&resize_img, input.data());

  // Inference.
  if (this->use_zero_copy_run_) {
    auto input_names = this->predictor_->GetInputNames();
    auto input_t = this->predictor_->GetInputTensor(input_names[0]);
    input_t->Reshape({1, 3, resize_img.rows, resize_img.cols});
    input_t->copy_from_cpu(input.data());
    this->predictor_->ZeroCopyRun();
  } else {
    paddle::PaddleTensor input_t;
    input_t.shape = {1, 3, resize_img.rows, resize_img.cols};
    input_t.data =
        paddle::PaddleBuf(input.data(), input.size() * sizeof(float));
    input_t.dtype = PaddleDType::FLOAT32;
    std::vector<paddle::PaddleTensor> outputs;
    this->predictor_->Run({input_t}, &outputs, 1);
  }

  std::vector<float> softmax_out;
  std::vector<int64_t> label_out;
  auto output_names = this->predictor_->GetOutputNames();
  auto softmax_out_t = this->predictor_->GetOutputTensor(output_names[0]);
  auto label_out_t = this->predictor_->GetOutputTensor(output_names[1]);
  auto softmax_shape_out = softmax_out_t->shape();
  auto label_shape_out = label_out_t->shape();

  int softmax_out_num =
      std::accumulate(softmax_shape_out.begin(), softmax_shape_out.end(), 1,
                      std::multiplies<int>());

  int label_out_num =
      std::accumulate(label_shape_out.begin(), label_shape_out.end(), 1,
                      std::multiplies<int>());
  softmax_out.resize(softmax_out_num);
  label_out.resize(label_out_num);

  softmax_out_t->copy_to_cpu(softmax_out.data());
  label_out_t->copy_to_cpu(label_out.data());

  int label = label_out[0];
  float score = softmax_out[label];
  // std::cout << "\nlabel " << label << " score: " << score;
  if (label % 2 == 1 && score > this->cls_thresh) {
    cv::rotate(src_img, src_img, 1);
  }
  return src_img;
}

void Classifier::LoadModel(const std::string &model_dir) {
  AnalysisConfig config;
  config.SetModel(model_dir + "/model", model_dir + "/params");

  if (this->use_gpu_) {
    config.EnableUseGpu(this->gpu_mem_, this->gpu_id_);
  } else {
    config.DisableGpu();
    if (this->use_mkldnn_) {
      config.EnableMKLDNN();
    }
    config.SetCpuMathLibraryNumThreads(this->cpu_math_library_num_threads_);
  }

  // false for zero copy tensor
  config.SwitchUseFeedFetchOps(!this->use_zero_copy_run_);
  // true for multiple input
  config.SwitchSpecifyInputNames(true);

  config.SwitchIrOptim(true);

  config.EnableMemoryOptim();
  config.DisableGlogInfo();

  this->predictor_ = CreatePaddlePredictor(config);
}
} // namespace PaddleOCR
```
@@ -26,12 +26,15 @@ void DBDetector::LoadModel(const std::string &model_dir) {
     config.DisableGpu();
     if (this->use_mkldnn_) {
       config.EnableMKLDNN();
+      // cache 10 different shapes for MKL-DNN to avoid a memory leak
+      config.SetMkldnnCacheCapacity(10);
     }
     config.SetCpuMathLibraryNumThreads(this->cpu_math_library_num_threads_);
   }

   // false for zero-copy tensor
-  config.SwitchUseFeedFetchOps(false);
+  // true for common tensor
+  config.SwitchUseFeedFetchOps(!this->use_zero_copy_run_);
   // true for multiple input
   config.SwitchSpecifyInputNames(true);
@@ -59,12 +62,22 @@ void DBDetector::Run(cv::Mat &img,
   std::vector<float> input(1 * 3 * resize_img.rows * resize_img.cols, 0.0f);
   this->permute_op_.Run(&resize_img, input.data());

-  auto input_names = this->predictor_->GetInputNames();
-  auto input_t = this->predictor_->GetInputTensor(input_names[0]);
-  input_t->Reshape({1, 3, resize_img.rows, resize_img.cols});
-  input_t->copy_from_cpu(input.data());
-  this->predictor_->ZeroCopyRun();
+  // Inference.
+  if (this->use_zero_copy_run_) {
+    auto input_names = this->predictor_->GetInputNames();
+    auto input_t = this->predictor_->GetInputTensor(input_names[0]);
+    input_t->Reshape({1, 3, resize_img.rows, resize_img.cols});
+    input_t->copy_from_cpu(input.data());
+    this->predictor_->ZeroCopyRun();
+  } else {
+    paddle::PaddleTensor input_t;
+    input_t.shape = {1, 3, resize_img.rows, resize_img.cols};
+    input_t.data =
+        paddle::PaddleBuf(input.data(), input.size() * sizeof(float));
+    input_t.dtype = PaddleDType::FLOAT32;
+    std::vector<paddle::PaddleTensor> outputs;
+    this->predictor_->Run({input_t}, &outputs, 1);
+  }

   std::vector<float> out_data;
   auto output_names = this->predictor_->GetOutputNames();
@@ -95,9 +108,11 @@ void DBDetector::Run(cv::Mat &img,
   const double maxvalue = 255;
   cv::Mat bit_map;
   cv::threshold(cbuf_map, bit_map, threshold, maxvalue, cv::THRESH_BINARY);
+  cv::Mat dilation_map;
+  cv::Mat dila_ele = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(2, 2));
+  cv::dilate(bit_map, dilation_map, dila_ele);
   boxes = post_processor_.BoxesFromBitmap(
-      pred_map, bit_map, this->det_db_box_thresh_, this->det_db_unclip_ratio_);
+      pred_map, dilation_map, this->det_db_box_thresh_,
+      this->det_db_unclip_ratio_);

   boxes = post_processor_.FilterTagDetRes(boxes, ratio_h, ratio_w, srcimg);
@@ -17,7 +17,7 @@
 namespace PaddleOCR {

 void CRNNRecognizer::Run(std::vector<std::vector<std::vector<int>>> boxes,
-                         cv::Mat &img) {
+                         cv::Mat &img, Classifier *cls) {
   cv::Mat srcimg;
   img.copyTo(srcimg);
   cv::Mat crop_img;
@@ -27,6 +27,9 @@ void CRNNRecognizer::Run(std::vector<std::vector<std::vector<int>>> boxes,
   int index = 0;
   for (int i = boxes.size() - 1; i >= 0; i--) {
     crop_img = GetRotateCropImage(srcimg, boxes[i]);
+    if (cls != nullptr) {
+      crop_img = cls->Run(crop_img);
+    }

     float wh_ratio = float(crop_img.cols) / float(crop_img.rows);
@@ -39,18 +42,29 @@ void CRNNRecognizer::Run(std::vector<std::vector<std::vector<int>>> boxes,

     this->permute_op_.Run(&resize_img, input.data());

-    auto input_names = this->predictor_->GetInputNames();
-    auto input_t = this->predictor_->GetInputTensor(input_names[0]);
-    input_t->Reshape({1, 3, resize_img.rows, resize_img.cols});
-    input_t->copy_from_cpu(input.data());
-    this->predictor_->ZeroCopyRun();
+    // Inference.
+    if (this->use_zero_copy_run_) {
+      auto input_names = this->predictor_->GetInputNames();
+      auto input_t = this->predictor_->GetInputTensor(input_names[0]);
+      input_t->Reshape({1, 3, resize_img.rows, resize_img.cols});
+      input_t->copy_from_cpu(input.data());
+      this->predictor_->ZeroCopyRun();
+    } else {
+      paddle::PaddleTensor input_t;
+      input_t.shape = {1, 3, resize_img.rows, resize_img.cols};
+      input_t.data =
+          paddle::PaddleBuf(input.data(), input.size() * sizeof(float));
+      input_t.dtype = PaddleDType::FLOAT32;
+      std::vector<paddle::PaddleTensor> outputs;
+      this->predictor_->Run({input_t}, &outputs, 1);
+    }

     std::vector<int64_t> rec_idx;
     auto output_names = this->predictor_->GetOutputNames();
     auto output_t = this->predictor_->GetOutputTensor(output_names[0]);
     auto rec_idx_lod = output_t->lod();
     auto shape_out = output_t->shape();

     int out_num = std::accumulate(shape_out.begin(), shape_out.end(), 1,
                                   std::multiplies<int>());
@@ -115,12 +129,15 @@ void CRNNRecognizer::LoadModel(const std::string &model_dir) {
     config.DisableGpu();
     if (this->use_mkldnn_) {
       config.EnableMKLDNN();
+      // cache 10 different shapes for MKL-DNN to avoid a memory leak
+      config.SetMkldnnCacheCapacity(10);
     }
     config.SetCpuMathLibraryNumThreads(this->cpu_math_library_num_threads_);
   }

   // false for zero-copy tensor
-  config.SwitchUseFeedFetchOps(false);
+  // true for common tensor
+  config.SwitchUseFeedFetchOps(!this->use_zero_copy_run_);
   // true for multiple input
   config.SwitchSpecifyInputNames(true);
@@ -219,7 +219,7 @@ PostProcessor::BoxesFromBitmap(const cv::Mat pred, const cv::Mat bitmap,
   std::vector<std::vector<std::vector<int>>> boxes;

   for (int _i = 0; _i < num_contours; _i++) {
-    if (contours[_i].size() <= 0) {
+    if (contours[_i].size() <= 2) {
       continue;
     }
     float ssid;
@@ -294,7 +294,7 @@ PostProcessor::FilterTagDetRes(std::vector<std::vector<std::vector<int>>> boxes,
                           pow(boxes[n][0][1] - boxes[n][1][1], 2)));
     rect_height = int(sqrt(pow(boxes[n][0][0] - boxes[n][3][0], 2) +
                            pow(boxes[n][0][1] - boxes[n][3][1], 2)));
-    if (rect_width <= 10 || rect_height <= 10)
+    if (rect_width <= 4 || rect_height <= 4)
       continue;
     root_points.push_back(boxes[n]);
   }
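The filtering rule above (drop boxes whose edges are too short, with the threshold relaxed from 10 to 4 pixels) can be sketched in Python for illustration; the helper name is ours:

```python
import math

def filter_boxes(boxes, min_side=4):
    """Keep only quadrilaterals whose width and height exceed min_side.

    Each box is four [x, y] corner points; width is the distance from
    point 0 to point 1, height from point 0 to point 3, matching
    FilterTagDetRes above.
    """
    kept = []
    for box in boxes:
        width = int(math.hypot(box[0][0] - box[1][0], box[0][1] - box[1][1]))
        height = int(math.hypot(box[0][0] - box[3][0], box[0][1] - box[3][1]))
        if width <= min_side or height <= min_side:
            continue
        kept.append(box)
    return kept
```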
@@ -116,4 +116,26 @@ void CrnnResizeImg::Run(const cv::Mat &img, cv::Mat &resize_img, float wh_ratio,
              cv::INTER_LINEAR);
 }

+void ClsResizeImg::Run(const cv::Mat &img, cv::Mat &resize_img,
+                       const std::vector<int> &rec_image_shape) {
+  int imgC, imgH, imgW;
+  imgC = rec_image_shape[0];
+  imgH = rec_image_shape[1];
+  imgW = rec_image_shape[2];
+
+  float ratio = float(img.cols) / float(img.rows);
+  int resize_w, resize_h;
+  if (ceilf(imgH * ratio) > imgW)
+    resize_w = imgW;
+  else
+    resize_w = int(ceilf(imgH * ratio));
+
+  cv::resize(img, resize_img, cv::Size(resize_w, imgH), 0.f, 0.f,
+             cv::INTER_LINEAR);
+  if (resize_w < imgW) {
+    cv::copyMakeBorder(resize_img, resize_img, 0, 0, 0, imgW - resize_w,
+                       cv::BORDER_CONSTANT, cv::Scalar(0, 0, 0));
+  }
+}
+
 } // namespace PaddleOCR
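The width computation in `ClsResizeImg::Run` above (scale to the target height while keeping the aspect ratio, cap at the target width, pad the rest with black) can be sketched as follows; the function name is ours:

```python
import math

def cls_resize_width(img_w, img_h, rec_image_shape=(3, 48, 192)):
    """Target width before zero-padding: the crop is scaled to the
    target height imgH while keeping its aspect ratio, capped at the
    target width imgW (the remainder up to imgW is padded)."""
    _, imgH, imgW = rec_image_shape
    ratio = img_w / img_h
    if math.ceil(imgH * ratio) > imgW:
        return imgW
    return int(math.ceil(imgH * ratio))
```

For a 100x50 crop the resize width is 96 (48 * 2); a very wide 1000x50 crop saturates at 192.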
@@ -39,22 +39,21 @@ std::vector<std::string> Utility::ReadDict(const std::string &path) {
 void Utility::VisualizeBboxes(
     const cv::Mat &srcimg,
     const std::vector<std::vector<std::vector<int>>> &boxes) {
-  cv::Point rook_points[boxes.size()][4];
-  for (int n = 0; n < boxes.size(); n++) {
-    for (int m = 0; m < boxes[0].size(); m++) {
-      rook_points[n][m] = cv::Point(int(boxes[n][m][0]), int(boxes[n][m][1]));
-    }
-  }
   cv::Mat img_vis;
   srcimg.copyTo(img_vis);
   for (int n = 0; n < boxes.size(); n++) {
-    const cv::Point *ppt[1] = {rook_points[n]};
+    cv::Point rook_points[4];
+    for (int m = 0; m < boxes[n].size(); m++) {
+      rook_points[m] = cv::Point(int(boxes[n][m][0]), int(boxes[n][m][1]));
+    }
+
+    const cv::Point *ppt[1] = {rook_points};
     int npt[] = {4};
     cv::polylines(img_vis, ppt, npt, 1, 1, CV_RGB(0, 255, 0), 2, 8, 0);
   }

   cv::imwrite("./ocr_vis.png", img_vis);
-  std::cout << "The detection visualized image saved in ./ocr_vis.png.pn"
+  std::cout << "The detection visualized image saved in ./ocr_vis.png"
             << std::endl;
 }
@@ -1,8 +1,7 @@
 OPENCV_DIR=your_opencv_dir
 LIB_DIR=your_paddle_inference_dir
 CUDA_LIB_DIR=your_cuda_lib_dir
-CUDNN_LIB_DIR=/your_cudnn_lib_dir
+CUDNN_LIB_DIR=your_cudnn_lib_dir

 BUILD_DIR=build
 rm -rf ${BUILD_DIR}
@@ -11,7 +10,6 @@ cd ${BUILD_DIR}
 cmake .. \
     -DPADDLE_LIB=${LIB_DIR} \
     -DWITH_MKL=ON \
-    -DDEMO_NAME=ocr_system \
     -DWITH_GPU=OFF \
     -DWITH_STATIC_LIB=OFF \
     -DUSE_TENSORRT=OFF \
@@ -3,20 +3,25 @@ use_gpu 0
 gpu_id 0
 gpu_mem 4000
 cpu_math_library_num_threads 10
-use_mkldnn 0
+use_mkldnn 1
+use_zero_copy_run 1

 # det config
 max_side_len 960
 det_db_thresh 0.3
 det_db_box_thresh 0.5
-det_db_unclip_ratio 2.0
+det_db_unclip_ratio 1.6
 det_model_dir ./inference/det_db

+# cls config
+use_angle_cls 0
+cls_model_dir ./inference/cls
+cls_thresh 0.9
+
 # rec config
 rec_model_dir ./inference/rec_crnn
 char_list_file ../../ppocr/utils/ppocr_keys_v1.txt
-img_path ../../doc/imgs/11.jpg

 # show the detection results
-visualize 0
+visualize 1
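The config file above is a simple whitespace-separated key/value format with `#` comments. A minimal Python reader might look like this; the parser is ours for illustration, not part of the repository:

```python
def load_config(text):
    """Parse 'key value' lines; blank lines and '#' comments are
    skipped; values are kept as strings for the caller to cast."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        cfg[key] = value.strip()
    return cfg
```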
@@ -0,0 +1,58 @@
English | [简体中文](README_cn.md)

## Introduction
Many users hope to package the PaddleOCR service into a Docker image so that it can be quickly released and used in Docker or Kubernetes environments.

This page provides standardized code to achieve that goal, so that you can quickly publish the PaddleOCR project as a callable RESTful API service through the following steps. (At present, only deployment based on the HubServing mode is implemented; deployment based on the PaddleServing mode is planned for the future.)

## 1. Prerequisites

You need to install the following basic components first:
a. Docker
b. Graphics driver and CUDA 10.0+ (GPU)
c. NVIDIA Container Toolkit (GPU; Docker 19.03+ can skip this)
d. cuDNN 7.6+ (GPU)

## 2. Build the image
a. Download the PaddleOCR source code
```
git clone https://github.com/PaddlePaddle/PaddleOCR.git
```
b. Go to the Dockerfile directory (note: the CPU and GPU versions are separate; the following takes CPU as an example, for the GPU version just replace the keyword)
```
cd deploy/docker/cpu
```
c. Build the image
```
docker build -t paddleocr:cpu .
```

## 3. Start the container
a. CPU version
```
sudo docker run -dp 8866:8866 --name paddle_ocr paddleocr:cpu
```
b. GPU version (based on the NVIDIA Container Toolkit)
```
sudo nvidia-docker run -dp 8866:8866 --name paddle_ocr paddleocr:gpu
```
c. GPU version (Docker 19.03+)
```
sudo docker run -dp 8866:8866 --gpus all --name paddle_ocr paddleocr:gpu
```
d. Check the service status (if you see the messages "Successfully installed ocr_system" and "Running on http://0.0.0.0:8866/", the service started successfully)
```
docker logs -f paddle_ocr
```

## 4. Test
a. Compute the Base64 encoding of the image to be recognized (for a quick test you can use a free online tool such as https://freeonlinetools24.com/base64-image/)
b. Post a service request (see the sample request in sample_request.txt)

```
curl -H "Content-Type:application/json" -X POST --data "{\"images\": [\"Input image Base64 encode(need to delete the code 'data:image/jpg;base64,')\"]}" http://localhost:8866/predict/ocr_system
```
c. Get the response (if the call is successful, the following result will be returned)
```
{"msg":"","results":[[{"confidence":0.8403433561325073,"text":"约定","text_region":[[345,377],[641,390],[634,540],[339,528]]},{"confidence":0.8131805658340454,"text":"最终相遇","text_region":[[356,532],[624,530],[624,596],[356,598]]}]],"status":"0"}
```
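The curl request above can also be built programmatically. A minimal Python sketch (the helper name is ours) that produces the bare Base64 payload the service expects:

```python
import base64
import json

def build_request_body(image_bytes):
    """JSON body for /predict/ocr_system: a plain Base64 string, with
    no 'data:image/jpg;base64,' prefix, inside an "images" list."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps({"images": [encoded]})
```

The resulting string can be posted as-is with any HTTP client using the `Content-Type: application/json` header.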
@@ -0,0 +1,57 @@
[English](README.md) | Simplified Chinese

## Dockerized service deployment
In day-to-day projects, you will usually want to package the PaddleOCR service into a Docker image so that it can be quickly released and used in Docker or Kubernetes environments.

This page provides standardized code to achieve that goal, so that you can quickly publish the PaddleOCR project as a callable RESTful API service through the following steps. (At present, only deployment based on the HubServing mode is implemented; deployment based on the PaddleServing mode is planned for the future.)

## 1. Prerequisites

You need to install the following basic components first:
a. Docker
b. Graphics driver and CUDA 10.0+ (GPU)
c. NVIDIA Container Toolkit (GPU; Docker 19.03+ can skip this)
d. cuDNN 7.6+ (GPU)

## 2. Build the image
a. Download the PaddleOCR source code
```
git clone https://github.com/PaddlePaddle/PaddleOCR.git
```
b. Go to the Dockerfile directory (note: the CPU and GPU versions are separate; the following takes CPU as an example, for the GPU version just replace the keyword)
```
cd deploy/docker/cpu
```
c. Build the image
```
docker build -t paddleocr:cpu .
```

## 3. Start the Docker container
a. CPU version
```
sudo docker run -dp 8866:8866 --name paddle_ocr paddleocr:cpu
```
b. GPU version (via the NVIDIA Container Toolkit)
```
sudo nvidia-docker run -dp 8866:8866 --name paddle_ocr paddleocr:gpu
```
c. GPU version (Docker 19.03+ can use the following command directly)
```
sudo docker run -dp 8866:8866 --gpus all --name paddle_ocr paddleocr:gpu
```
d. Check the service status (if you see the messages "Successfully installed ocr_system" and "Running on http://0.0.0.0:8866/", the service started successfully)
```
docker logs -f paddle_ocr
```

## 4. Test the service
a. Compute the Base64 encoding of the image to be recognized (for a quick test you can use a free online tool such as http://tool.chinaz.com/tools/imgtobase/)
b. Send a service request (see the values in sample_request.txt)
```
curl -H "Content-Type:application/json" -X POST --data "{\"images\": [\"image Base64 encoding (remove the 'data:image/jpg;base64,' prefix)\"]}" http://localhost:8866/predict/ocr_system
```
c. Returned result (if the call is successful, the following result is returned)
```
{"msg":"","results":[[{"confidence":0.8403433561325073,"text":"约定","text_region":[[345,377],[641,390],[634,540],[339,528]]},{"confidence":0.8131805658340454,"text":"最终相遇","text_region":[[356,532],[624,530],[624,596],[356,598]]}]],"status":"0"}
```
@@ -0,0 +1,28 @@
# Version: 1.0.0
FROM hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda9.0-cudnn7-dev

# PaddleOCR is based on Python 3.7
RUN pip3.7 install --upgrade pip -i https://pypi.tuna.tsinghua.edu.cn/simple

RUN python3.7 -m pip install paddlepaddle==1.7.2 -i https://pypi.tuna.tsinghua.edu.cn/simple

RUN pip3.7 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple

RUN git clone https://gitee.com/PaddlePaddle/PaddleOCR

WORKDIR /PaddleOCR

RUN pip3.7 install -r requirments.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

RUN mkdir -p /PaddleOCR/inference
# Download the OCR detection model (light version). If you want the general version, change ch_det_mv3_db_infer to ch_det_r50_vd_db_infer, and remember to change det_model_dir in deploy/hubserving/ocr_system/params.py accordingly.
ADD https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar /PaddleOCR/inference
RUN tar xf /PaddleOCR/inference/ch_det_mv3_db_infer.tar -C /PaddleOCR/inference

# Download the OCR recognition model (light version). If you want the general version, change ch_rec_mv3_crnn_infer to ch_rec_r34_vd_crnn_enhance_infer, and remember to change rec_model_dir in deploy/hubserving/ocr_system/params.py accordingly.
ADD https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar /PaddleOCR/inference
RUN tar xf /PaddleOCR/inference/ch_rec_mv3_crnn_infer.tar -C /PaddleOCR/inference

EXPOSE 8866

CMD ["/bin/bash","-c","export PYTHONPATH=. && hub install deploy/hubserving/ocr_system/ && hub serving start -m ocr_system"]
@@ -0,0 +1,28 @@
# Version: 1.0.0
FROM hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda10.0-cudnn7-dev

# PaddleOCR is based on Python 3.7
RUN pip3.7 install --upgrade pip -i https://pypi.tuna.tsinghua.edu.cn/simple

RUN python3.7 -m pip install paddlepaddle-gpu==1.7.2.post107 -i https://pypi.tuna.tsinghua.edu.cn/simple

RUN pip3.7 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple

RUN git clone https://gitee.com/PaddlePaddle/PaddleOCR

WORKDIR /home/PaddleOCR

RUN pip3.7 install -r requirments.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

RUN mkdir -p /PaddleOCR/inference
# Download the OCR detection model (light version). If you want the general version, change ch_det_mv3_db_infer to ch_det_r50_vd_db_infer, and remember to change det_model_dir in deploy/hubserving/ocr_system/params.py accordingly.
ADD https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar /PaddleOCR/inference
RUN tar xf /PaddleOCR/inference/ch_det_mv3_db_infer.tar -C /PaddleOCR/inference

# Download the OCR recognition model (light version). If you want the general version, change ch_rec_mv3_crnn_infer to ch_rec_r34_vd_crnn_enhance_infer, and remember to change rec_model_dir in deploy/hubserving/ocr_system/params.py accordingly.
ADD https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar /PaddleOCR/inference
RUN tar xf /PaddleOCR/inference/ch_rec_mv3_crnn_infer.tar -C /PaddleOCR/inference

EXPOSE 8866

CMD ["/bin/bash","-c","export PYTHONPATH=. && hub install deploy/hubserving/ocr_system/ && hub serving start -m ocr_system"]
File diff suppressed because one or more lines are too long
@@ -31,7 +31,7 @@ from tools.infer.predict_det import TextDetector
     author_email="paddle-dev@baidu.com",
     type="cv/text_recognition")
 class OCRDet(hub.Module):
-    def _initialize(self, use_gpu=False):
+    def _initialize(self, use_gpu=False, enable_mkldnn=False):
         """
         initialize with the necessary elements
         """
@@ -51,6 +51,7 @@ class OCRDet(hub.Module):
                     "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES via export CUDA_VISIBLE_DEVICES=cuda_device_id."
                 )
         cfg.ir_optim = True
+        cfg.enable_mkldnn = enable_mkldnn

         self.text_detector = TextDetector(cfg)
@@ -13,7 +13,7 @@ def read_params():

     #params for text detector
     cfg.det_algorithm = "DB"
-    cfg.det_model_dir = "./inference/ch_det_mv3_db/"
+    cfg.det_model_dir = "./inference/ch_ppocr_mobile_v1.1_det_infer/"
     cfg.det_max_side_len = 960

     #DB params
@@ -36,4 +36,6 @@ def read_params():
     # cfg.rec_char_dict_path = "./ppocr/utils/ppocr_keys_v1.txt"
     # cfg.use_space_char = True

+    cfg.use_zero_copy_run = False
+
     return cfg
@@ -31,7 +31,7 @@ from tools.infer.predict_rec import TextRecognizer
     author_email="paddle-dev@baidu.com",
     type="cv/text_recognition")
 class OCRRec(hub.Module):
-    def _initialize(self, use_gpu=False):
+    def _initialize(self, use_gpu=False, enable_mkldnn=False):
         """
         initialize with the necessary elements
         """
@@ -51,6 +51,7 @@ class OCRRec(hub.Module):
                     "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES via export CUDA_VISIBLE_DEVICES=cuda_device_id."
                 )
         cfg.ir_optim = True
+        cfg.enable_mkldnn = enable_mkldnn

         self.text_recognizer = TextRecognizer(cfg)
@@ -28,12 +28,24 @@ def read_params():

     #params for text recognizer
     cfg.rec_algorithm = "CRNN"
-    cfg.rec_model_dir = "./inference/ch_rec_mv3_crnn/"
+    cfg.rec_model_dir = "./inference/ch_ppocr_mobile_v1.1_rec_infer/"

     cfg.rec_image_shape = "3, 32, 320"
     cfg.rec_char_type = 'ch'
     cfg.rec_batch_num = 30
+    cfg.max_text_length = 25

     cfg.rec_char_dict_path = "./ppocr/utils/ppocr_keys_v1.txt"
     cfg.use_space_char = True

+    #params for text classifier
+    cfg.use_angle_cls = True
+    cfg.cls_model_dir = "./inference/ch_ppocr_mobile_v1.1_cls_infer/"
+    cfg.cls_image_shape = "3, 48, 192"
+    cfg.label_list = ['0', '180']
+    cfg.cls_batch_num = 30
+    cfg.cls_thresh = 0.9
+
+    cfg.use_zero_copy_run = False
+
     return cfg
@@ -31,7 +31,7 @@ from tools.infer.predict_system import TextSystem
     author_email="paddle-dev@baidu.com",
     type="cv/text_recognition")
 class OCRSystem(hub.Module):
-    def _initialize(self, use_gpu=False):
+    def _initialize(self, use_gpu=False, enable_mkldnn=False):
         """
         initialize with the necessary elements
         """
@@ -51,7 +51,8 @@ class OCRSystem(hub.Module):
                     "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES via export CUDA_VISIBLE_DEVICES=cuda_device_id."
                 )
         cfg.ir_optim = True
+        cfg.enable_mkldnn = enable_mkldnn

         self.text_sys = TextSystem(cfg)

     def read_images(self, paths=[]):
@@ -10,10 +10,10 @@ class Config(object):

 def read_params():
     cfg = Config()

     #params for text detector
     cfg.det_algorithm = "DB"
-    cfg.det_model_dir = "./inference/ch_det_mv3_db/"
+    cfg.det_model_dir = "./inference/ch_ppocr_mobile_v1.1_det_infer/"
     cfg.det_max_side_len = 960

     #DB params
@@ -28,12 +28,24 @@ def read_params():

     #params for text recognizer
     cfg.rec_algorithm = "CRNN"
-    cfg.rec_model_dir = "./inference/ch_rec_mv3_crnn/"
+    cfg.rec_model_dir = "./inference/ch_ppocr_mobile_v1.1_rec_infer/"

     cfg.rec_image_shape = "3, 32, 320"
     cfg.rec_char_type = 'ch'
     cfg.rec_batch_num = 30
+    cfg.max_text_length = 25

     cfg.rec_char_dict_path = "./ppocr/utils/ppocr_keys_v1.txt"
     cfg.use_space_char = True

+    #params for text classifier
+    cfg.use_angle_cls = True
+    cfg.cls_model_dir = "./inference/ch_ppocr_mobile_v1.1_cls_infer/"
+    cfg.cls_image_shape = "3, 48, 192"
+    cfg.label_list = ['0', '180']
+    cfg.cls_batch_num = 30
+    cfg.cls_thresh = 0.9
+
+    cfg.use_zero_copy_run = False
+
     return cfg
@@ -1,10 +1,12 @@
-# Service deployment
+[English](readme_en.md) | Simplified Chinese
+
 PaddleOCR provides two service deployment methods:
-- Deployment based on HubServing: already integrated into PaddleOCR ([code](https://github.com/PaddlePaddle/PaddleOCR/tree/develop/deploy/hubserving)); follow this tutorial to use it;
-- Deployment based on PaddleServing: see the official PaddleServing [demo](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/ocr); it will also be integrated into PaddleOCR later.
+- Deployment based on PaddleHub Serving: the code lives under "`./deploy/hubserving`"; follow this tutorial to use it;
+- Deployment based on PaddleServing: the code lives under "`./deploy/pdserving`"; see the [documentation](../pdserving/readme.md) for usage.

-The service deployment directory contains three service packages (detection, recognition, and the two-stage pipeline); install and start the one you need. The directory is as follows:
+# Service deployment based on PaddleHub Serving
+
+The hubserving deployment directory contains three service packages (detection, recognition, and the two-stage pipeline); please install and start the one you need. The directory structure is as follows:
 ```
 deploy/hubserving/
 └─ ocr_det            service package for the detection module
@@ -28,23 +30,51 @@ deploy/hubserving/ocr_system/
 # install paddlehub
 pip3 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple

-# set environment variables
+# set environment variables on Linux
 export PYTHONPATH=.
-```

-### 2. Install the service module
-PaddleOCR provides three service modules; install the ones you need. For example:
+# or, set environment variables on Windows
+SET PYTHONPATH=.
+```

-Install the detection service module:
-```hub install deploy/hubserving/ocr_det/```
+### 2. Download the inference models
+Before installing a service module, prepare the inference models and put them in the correct paths. The ultra-lightweight v1.1 models are used by default, at these default paths:
+```
+detection model:     ./inference/ch_ppocr_mobile_v1.1_det_infer/
+recognition model:   ./inference/ch_ppocr_mobile_v1.1_rec_infer/
+angle classifier:    ./inference/ch_ppocr_mobile_v1.1_cls_infer/
+```

-Or, install the recognition service module:
-```hub install deploy/hubserving/ocr_rec/```
+**The model paths can be viewed and modified in `params.py`.** More models can be downloaded from the PaddleOCR [model list](../../doc/doc_ch/models_list.md), or replaced with your own trained and converted models.

-Or, install the detection + recognition pipeline service module:
-```hub install deploy/hubserving/ocr_system/```
+### 3. Install the service module
+PaddleOCR provides three service modules; install the ones you need.

-### 3. Start the service
+* On Linux, the installation commands are as follows:
+```shell
+# install the detection service module:
+hub install deploy/hubserving/ocr_det/
+
+# or, install the recognition service module:
+hub install deploy/hubserving/ocr_rec/
+
+# or, install the detection + recognition pipeline service module:
+hub install deploy/hubserving/ocr_system/
+```
+
+* On Windows (where the folder separator is `\`), the installation commands are as follows:
+```shell
+# install the detection service module:
+hub install deploy\hubserving\ocr_det\
+
+# or, install the recognition service module:
+hub install deploy\hubserving\ocr_rec\
+
+# or, install the detection + recognition pipeline service module:
+hub install deploy\hubserving\ocr_system\
+```
+
+### 4. Start the service
 #### Method 1. Start from the command line (CPU only)
 **Start command:**
 ```shell
@@ -69,9 +99,9 @@ $ hub serving start --modules [Module1==Version1, Module2==Version2, ...] \

 #### Method 2. Start with a configuration file (CPU and GPU)
 **Start command:**
-```hub serving start --config/-c config.json```
+```hub serving start -c config.json```

 Here, the format of `config.json` is as follows:
 ```python
 {
     "modules_info": {
@@ -96,6 +126,7 @@ $ hub serving start --modules [Module1==Version1, Module2==Version2, ...] \
 **Note:**
 - When the service is started from a configuration file, other parameters are ignored.
 - If you use GPU prediction (that is, `use_gpu` is set to `true`), set the CUDA_VISIBLE_DEVICES environment variable before starting the service, e.g. ```export CUDA_VISIBLE_DEVICES=0```; otherwise it is not needed.
+- **`use_gpu` and `use_multiprocess` must not both be `true`.**

 For example, start the pipeline service on GPU card 3:
 ```shell
@@ -120,6 +151,25 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
 Access example:
 ```python tools/test_hubserving.py http://127.0.0.1:8868/predict/ocr_system ./doc/imgs/```
+
+## Returned result format
+The returned result is a list; each item in the list is a dict that may contain up to three fields:
+
+|Field|Type|Meaning|
+|-|-|-|
+|text|str|text content|
+|confidence|float|text recognition confidence|
+|text_region|list|text position coordinates|
|
||||||
|
|
||||||
|
不同模块返回的字段不同,如,文本识别服务模块返回结果不含`text_region`字段,具体信息如下:
|
||||||
|
|
||||||
|
|字段名/模块名|ocr_det|ocr_rec|ocr_system|
|
||||||
|
|-|-|-|-|
|
||||||
|
|text||✔|✔|
|
||||||
|
|confidence||✔|✔|
|
||||||
|
|text_region|✔||✔|
|
||||||
|
|
||||||
|
**说明:** 如果需要增加、删除、修改返回字段,可在相应模块的`module.py`文件中进行修改,完整流程参考下一节自定义修改服务模块。
|
||||||
|
|
||||||
## 自定义修改服务模块
|
## 自定义修改服务模块
|
||||||
如果需要修改服务逻辑,你一般需要操作以下步骤(以修改`ocr_system`为例):
|
如果需要修改服务逻辑,你一般需要操作以下步骤(以修改`ocr_system`为例):
|
||||||
|
|
||||||
|
@ -127,7 +177,7 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
|
||||||
```hub serving stop --port/-p XXXX```
|
```hub serving stop --port/-p XXXX```
|
||||||
|
|
||||||
- 2、 到相应的`module.py`和`params.py`等文件中根据实际需求修改代码。
|
- 2、 到相应的`module.py`和`params.py`等文件中根据实际需求修改代码。
|
||||||
例如,如果需要替换部署服务所用模型,则需要到`params.py`中修改模型路径参数`det_model_dir`和`rec_model_dir`,当然,同时可能还需要修改其他相关参数,请根据实际情况修改调试。 建议修改后先直接运行`module.py`调试,能正确运行预测后再启动服务测试。
|
例如,如果需要替换部署服务所用模型,则需要到`params.py`中修改模型路径参数`det_model_dir`和`rec_model_dir`,如果需要关闭文本方向分类器,则将参数`use_angle_cls`置为`False`,当然,同时可能还需要修改其他相关参数,请根据实际情况修改调试。 **强烈建议修改后先直接运行`module.py`调试,能正确运行预测后再启动服务测试。**
|
||||||
|
|
||||||
- 3、 卸载旧服务包
|
- 3、 卸载旧服务包
|
||||||
```hub uninstall ocr_system```
|
```hub uninstall ocr_system```
|
||||||
|
@ -137,4 +187,3 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
|
||||||
|
|
||||||
- 5、重新启动服务
|
- 5、重新启动服务
|
||||||
```hub serving start -m ocr_system```
|
```hub serving start -m ocr_system```
|
||||||
|
|
|
@ -0,0 +1,200 @@
English | [简体中文](readme.md)

PaddleOCR provides 2 service deployment methods:
- Based on **PaddleHub Serving**: Code path is "`./deploy/hubserving`". Please follow this tutorial.
- Based on **PaddleServing**: Code path is "`./deploy/pdserving`". Please refer to the [tutorial](../pdserving/readme_en.md) for usage.

# Service deployment based on PaddleHub Serving

The hubserving service deployment directory includes three service packages: detection, recognition, and two-stage pipeline. Please select and install the service package you need, then start it. The directory is as follows:
```
deploy/hubserving/
  └─  ocr_det      detection module service package
  └─  ocr_rec      recognition module service package
  └─  ocr_system   two-stage pipeline service package
```

Each service package contains 4 files. Taking the two-stage pipeline service package as an example, the directory is as follows:
```
deploy/hubserving/ocr_system/
  └─  __init__.py   empty file, required
  └─  config.json   configuration file, optional, passed in as a parameter when starting the service with a configuration
  └─  module.py     main module file, required, contains the complete logic of the service
  └─  params.py     parameter file, required, includes parameters such as the model paths and pre- and post-processing parameters
```

## Quick start service

The following steps take the two-stage pipeline service as an example. If you only need the detection service or the recognition service, replace the corresponding file paths.

### 1. Prepare the environment
```shell
# Install paddlehub
pip3 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple

# Set environment variables on Linux
export PYTHONPATH=.

# Or, set environment variables on Windows
SET PYTHONPATH=.
```
### 2. Download the inference model

Before installing the service module, you need to prepare the inference model and put it in the correct path. The ultra-lightweight v1.1 models are used by default; the default model paths are:
```
detection model:           ./inference/ch_ppocr_mobile_v1.1_det_infer/
recognition model:         ./inference/ch_ppocr_mobile_v1.1_rec_infer/
text direction classifier: ./inference/ch_ppocr_mobile_v1.1_cls_infer/
```

**The model paths can be viewed and modified in `params.py`.** More models provided by PaddleOCR can be obtained from the [model library](../../doc/doc_en/models_list_en.md). You can also use models trained by yourself.
### 3. Install the service module

PaddleOCR provides 3 service modules; install the ones you need.

* On Linux, the installation commands are as follows:
```shell
# Install the detection service module:
hub install deploy/hubserving/ocr_det/

# Or, install the recognition service module:
hub install deploy/hubserving/ocr_rec/

# Or, install the two-stage pipeline service module:
hub install deploy/hubserving/ocr_system/
```

* On Windows (the path separator is `\`), the installation commands are as follows:
```shell
# Install the detection service module:
hub install deploy\hubserving\ocr_det\

# Or, install the recognition service module:
hub install deploy\hubserving\ocr_rec\

# Or, install the two-stage pipeline service module:
hub install deploy\hubserving\ocr_system\
```
### 4. Start the service

#### Way 1. Start with command-line parameters (CPU only)

**Start command:**
```shell
$ hub serving start --modules [Module1==Version1, Module2==Version2, ...] \
                    --port XXXX \
                    --use_multiprocess \
                    --workers \
```
**Parameters:**

|parameters|usage|
|-|-|
|`--modules`/`-m`|PaddleHub Serving pre-installed models, listed as multiple `Module==Version` key-value pairs<br>*`When Version is not specified, the latest version is selected by default`*|
|`--port`/`-p`|Service port, default is 8866|
|`--use_multiprocess`|Enable concurrent mode; the default is single-process mode. This mode is recommended for multi-core CPU machines<br>*`Windows only supports single-process mode`*|
|`--workers`|The number of concurrent tasks in concurrent mode, default is `2*cpu_count-1`, where `cpu_count` is the number of CPU cores|

For example, start the two-stage pipeline service:
```shell
hub serving start -m ocr_system
```

This deploys a service API, listening on the default port 8866.
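The `--workers` default quoted in the parameters table is derived from the machine's core count. A quick sketch of that formula in Python (the helper name is illustrative, not part of PaddleHub):

```python
import os

# Default worker count used by PaddleHub Serving's concurrent mode,
# per the parameters table above: 2 * cpu_count - 1.
def default_workers(cpu_count=None):
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    return 2 * cpu_count - 1

print(default_workers(4))  # a 4-core machine defaults to 7 workers
```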
#### Way 2. Start with a configuration file (CPU, GPU)

**Start command:**
```shell
hub serving start --config/-c config.json
```
The format of `config.json` is as follows:
```json
{
    "modules_info": {
        "ocr_system": {
            "init_args": {
                "version": "1.0.0",
                "use_gpu": true
            },
            "predict_args": {
            }
        }
    },
    "port": 8868,
    "use_multiprocess": false,
    "workers": 2
}
```
- The configurable parameters in `init_args` are consistent with the `_initialize` function interface in `module.py`. In particular, **when `use_gpu` is `true`, the service runs on the GPU**.
- The configurable parameters in `predict_args` are consistent with the `predict` function interface in `module.py`.

**Note:**
- When using a configuration file to start the service, other command-line parameters are ignored.
- If you use GPU prediction (that is, `use_gpu` is set to `true`), you need to set the environment variable CUDA_VISIBLE_DEVICES before starting the service, e.g. `export CUDA_VISIBLE_DEVICES=0`; otherwise you do not need to set it.
- **`use_gpu` and `use_multiprocess` cannot be `true` at the same time.**

For example, use GPU card No. 3 to start the two-stage pipeline service:
```shell
export CUDA_VISIBLE_DEVICES=3
hub serving start -c deploy/hubserving/ocr_system/config.json
```
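The notes above impose one hard constraint: `use_gpu` and `use_multiprocess` cannot both be `true`. A minimal sketch (`validate_serving_config` is a hypothetical helper, not part of PaddleOCR) that checks a loaded `config.json` before launching:

```python
import json

# Hypothetical helper: check a hub serving config dict against the
# documented rule that use_gpu and use_multiprocess cannot both be true.
def validate_serving_config(cfg):
    use_multiprocess = cfg.get("use_multiprocess", False)
    for name, info in cfg.get("modules_info", {}).items():
        use_gpu = info.get("init_args", {}).get("use_gpu", False)
        if use_gpu and use_multiprocess:
            raise ValueError(
                f"{name}: use_gpu and use_multiprocess cannot both be true")

cfg = json.loads("""{
    "modules_info": {
        "ocr_system": {
            "init_args": {"version": "1.0.0", "use_gpu": true},
            "predict_args": {}
        }
    },
    "port": 8868,
    "use_multiprocess": false,
    "workers": 2
}""")
validate_serving_config(cfg)  # passes: use_multiprocess is false
```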

## Send prediction requests

After the service starts, you can send a prediction request with the following command and obtain the result:
```shell
python tools/test_hubserving.py server_url image_path
```

Two parameters need to be passed to the script:
- **server_url**: service address, in the format
`http://[ip_address]:[port]/predict/[module_name]`
For example, if the detection, recognition and two-stage pipeline services are started with the provided configuration files, the respective `server_url` values are:
`http://127.0.0.1:8866/predict/ocr_det`
`http://127.0.0.1:8867/predict/ocr_rec`
`http://127.0.0.1:8868/predict/ocr_system`
- **image_path**: test image path; it can be a single image path or an image directory path

**Example:**
```shell
python tools/test_hubserving.py http://127.0.0.1:8868/predict/ocr_system ./doc/imgs/
```
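Besides `test_hubserving.py`, you can call the service directly over HTTP. The payload shape below (`{"images": [<base64 string>, ...]}`) is an assumption based on how `tools/test_hubserving.py` encodes images; verify it against the `module.py` you deployed:

```python
import base64

# Assumed request body for the hubserving modules: base64-encoded images
# under the "images" key (shape taken from tools/test_hubserving.py).
def build_payload(image_bytes):
    data = base64.b64encode(image_bytes).decode("utf8")
    return {"images": [data]}

# Usage sketch (requires a running service and the `requests` package):
# import requests
# with open("path/to/image.jpg", "rb") as f:
#     payload = build_payload(f.read())
# r = requests.post("http://127.0.0.1:8868/predict/ocr_system", json=payload)
# print(r.json())
```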

## Returned result format

The returned result is a list. Each item in the list is a dict, which may contain up to three fields:

|field name|data type|description|
|-|-|-|
|text|str|text content|
|confidence|float|text recognition confidence|
|text_region|list|text location coordinates|

The fields returned by different modules differ. For example, results returned by the text recognition service module do not contain `text_region`:

|field name/module name|ocr_det|ocr_rec|ocr_system|
|-|-|-|-|
|text||✔|✔|
|confidence||✔|✔|
|text_region|✔||✔|

**Note:** If you need to add, delete or modify the returned fields, edit the file `module.py` of the corresponding module. For the complete process, refer to the user-defined service module modification in the next section.
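Given the field tables above, a small sketch that summarizes an `ocr_system` response (the sample data is made up for illustration):

```python
# Summarize the list-of-dicts returned by the ocr_system service,
# using the field names documented in the tables above.
def summarize(results):
    lines = []
    for item in results:
        text = item.get("text", "")
        conf = item.get("confidence", 0.0)
        region = item.get("text_region")  # absent in ocr_rec results
        lines.append(f"{text} (confidence={conf:.2f}, region={region})")
    return lines

sample = [{
    "text": "PaddleOCR",
    "confidence": 0.98,
    "text_region": [[10, 10], [120, 10], [120, 40], [10, 40]],
}]
print("\n".join(summarize(sample)))
```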

## User-defined service module modification

If you need to modify the service logic, the following steps are generally required (taking the modification of `ocr_system` as an example):

- 1. Stop the service:
```shell
hub serving stop --port/-p XXXX
```
- 2. Modify the code in the corresponding files, such as `module.py` and `params.py`, according to your needs.
For example, to replace the model used by the deployed service, modify the model path parameters `det_model_dir` and `rec_model_dir` in `params.py`; to turn off the text direction classifier, set the parameter `use_angle_cls` to `False`. Other related parameters may need to be modified as well; please adjust and debug according to the actual situation. It is suggested to run `module.py` directly for debugging after the modification, and to start the service test only once prediction runs correctly.
- 3. Uninstall the old service module:
```shell
hub uninstall ocr_system
```
- 4. Install the modified service module:
```shell
hub install deploy/hubserving/ocr_system/
```
- 5. Restart the service:
```shell
hub serving start -m ocr_system
```
@ -0,0 +1,32 @@
#!/bin/bash
set -e

OCR_MODEL_URL="https://paddleocr.bj.bcebos.com/deploy/lite/ocr_v1_for_cpu.tar.gz"
PADDLE_LITE_LIB_URL="https://paddlelite-demo.bj.bcebos.com/libs/ios/paddle_lite_libs_v2_6_0.tar.gz"
OPENCV3_FRAMEWORK_URL="https://paddlelite-demo.bj.bcebos.com/libs/ios/opencv3.framework.tar.gz"

# Download a tarball into a temporary directory and extract it into dst_dir.
download_and_extract() {
    local url="$1"
    local dst_dir="$2"
    local tempdir
    tempdir=$(mktemp -d)

    echo "Downloading ${url} ..."
    curl -L "${url}" > "${tempdir}/temp.tar.gz"
    echo "Download ${url} done"

    if [ ! -d "${dst_dir}" ]; then
        mkdir -p "${dst_dir}"
    fi

    echo "Extracting ..."
    tar -zxvf "${tempdir}/temp.tar.gz" -C "${dst_dir}"
    echo "Extract done"

    rm -rf "${tempdir}"
}

echo -e "[Download ios ocr demo dependency]\n"
download_and_extract "${OCR_MODEL_URL}" "./ocr_demo/models"
download_and_extract "${PADDLE_LITE_LIB_URL}" "./ocr_demo"
download_and_extract "${OPENCV3_FRAMEWORK_URL}" "./ocr_demo"
echo -e "[done]\n"
@ -0,0 +1,462 @@
|
||||||
|
// !$*UTF8*$!
|
||||||
|
{
|
||||||
|
archiveVersion = 1;
|
||||||
|
classes = {
|
||||||
|
};
|
||||||
|
objectVersion = 50;
|
||||||
|
objects = {
|
||||||
|
|
||||||
|
/* Begin PBXBuildFile section */
|
||||||
|
A98EA8B624CD8E85B9EFADA1 /* OcrData.m in Sources */ = {isa = PBXBuildFile; fileRef = A98EAD74A71FE136D084F392 /* OcrData.m */; };
|
||||||
|
E0A53559219A832A005A6056 /* AppDelegate.m in Sources */ = {isa = PBXBuildFile; fileRef = E0A53558219A832A005A6056 /* AppDelegate.m */; };
|
||||||
|
E0A5355C219A832A005A6056 /* ViewController.mm in Sources */ = {isa = PBXBuildFile; fileRef = E0A5355B219A832A005A6056 /* ViewController.mm */; };
|
||||||
|
E0A5355F219A832A005A6056 /* Main.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = E0A5355D219A832A005A6056 /* Main.storyboard */; };
|
||||||
|
E0A53561219A832E005A6056 /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = E0A53560219A832E005A6056 /* Assets.xcassets */; };
|
||||||
|
E0A53564219A832E005A6056 /* LaunchScreen.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = E0A53562219A832E005A6056 /* LaunchScreen.storyboard */; };
|
||||||
|
E0A53567219A832E005A6056 /* main.m in Sources */ = {isa = PBXBuildFile; fileRef = E0A53566219A832E005A6056 /* main.m */; };
|
||||||
|
E0A53576219A89FF005A6056 /* lib in Resources */ = {isa = PBXBuildFile; fileRef = E0A53574219A89FF005A6056 /* lib */; };
|
||||||
|
E0A5357B219AA3D1005A6056 /* AVFoundation.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = E0A5357A219AA3D1005A6056 /* AVFoundation.framework */; };
|
||||||
|
E0A5357D219AA3DF005A6056 /* AssetsLibrary.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = E0A5357C219AA3DE005A6056 /* AssetsLibrary.framework */; };
|
||||||
|
E0A5357F219AA3E7005A6056 /* CoreMedia.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = E0A5357E219AA3E7005A6056 /* CoreMedia.framework */; };
|
||||||
|
ED37528F24B88737008DEBDA /* opencv2.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = ED37528E24B88737008DEBDA /* opencv2.framework */; };
|
||||||
|
ED3752A424B9AD5C008DEBDA /* ocr_clipper.cpp in Sources */ = {isa = PBXBuildFile; fileRef = ED37529E24B9AD5C008DEBDA /* ocr_clipper.cpp */; };
|
||||||
|
ED3752A524B9AD5C008DEBDA /* ocr_db_post_process.cpp in Sources */ = {isa = PBXBuildFile; fileRef = ED3752A024B9AD5C008DEBDA /* ocr_db_post_process.cpp */; };
|
||||||
|
ED3752A624B9AD5C008DEBDA /* ocr_crnn_process.cpp in Sources */ = {isa = PBXBuildFile; fileRef = ED3752A124B9AD5C008DEBDA /* ocr_crnn_process.cpp */; };
|
||||||
|
ED3752A924B9ADB4008DEBDA /* BoxLayer.m in Sources */ = {isa = PBXBuildFile; fileRef = ED3752A824B9ADB4008DEBDA /* BoxLayer.m */; };
|
||||||
|
ED3752AC24B9B118008DEBDA /* Helpers.m in Sources */ = {isa = PBXBuildFile; fileRef = ED3752AA24B9B118008DEBDA /* Helpers.m */; };
|
||||||
|
ED3752B324B9D4C6008DEBDA /* ch_det_mv3_db_opt.nb in Resources */ = {isa = PBXBuildFile; fileRef = ED3752B124B9D4C6008DEBDA /* ch_det_mv3_db_opt.nb */; };
|
||||||
|
ED3752B424B9D4C6008DEBDA /* ch_rec_mv3_crnn_opt.nb in Resources */ = {isa = PBXBuildFile; fileRef = ED3752B224B9D4C6008DEBDA /* ch_rec_mv3_crnn_opt.nb */; };
|
||||||
|
ED3752BC24B9DAD7008DEBDA /* ocr.png in Resources */ = {isa = PBXBuildFile; fileRef = ED3752BA24B9DAD6008DEBDA /* ocr.png */; };
|
||||||
|
ED3752BD24B9DAD7008DEBDA /* label_list.txt in Resources */ = {isa = PBXBuildFile; fileRef = ED3752BB24B9DAD7008DEBDA /* label_list.txt */; };
|
||||||
|
/* End PBXBuildFile section */
|
||||||
|
|
||||||
|
/* Begin PBXFileReference section */
|
||||||
|
A98EAD74A71FE136D084F392 /* OcrData.m */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.objc; path = OcrData.m; sourceTree = "<group>"; };
|
||||||
|
A98EAD76E0669B2BFB628B48 /* OcrData.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = OcrData.h; sourceTree = "<group>"; };
|
||||||
|
E0846DF2230BC93900031405 /* timer.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = timer.h; sourceTree = "<group>"; };
|
||||||
|
E0A18EDD219C03000015DC15 /* face.jpg */ = {isa = PBXFileReference; lastKnownFileType = image.jpeg; path = face.jpg; sourceTree = "<group>"; };
|
||||||
|
E0A53554219A8329005A6056 /* paddle-lite-ocr.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = "paddle-lite-ocr.app"; sourceTree = BUILT_PRODUCTS_DIR; };
|
||||||
|
E0A53557219A832A005A6056 /* AppDelegate.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = AppDelegate.h; sourceTree = "<group>"; };
|
||||||
|
E0A53558219A832A005A6056 /* AppDelegate.m */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.objc; path = AppDelegate.m; sourceTree = "<group>"; };
|
||||||
|
E0A5355A219A832A005A6056 /* ViewController.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = ViewController.h; sourceTree = "<group>"; };
|
||||||
|
E0A5355B219A832A005A6056 /* ViewController.mm */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.objcpp; path = ViewController.mm; sourceTree = "<group>"; };
|
||||||
|
E0A5355E219A832A005A6056 /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/Main.storyboard; sourceTree = "<group>"; };
|
||||||
|
E0A53560219A832E005A6056 /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = Assets.xcassets; sourceTree = "<group>"; };
|
||||||
|
E0A53563219A832E005A6056 /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/LaunchScreen.storyboard; sourceTree = "<group>"; };
|
||||||
|
E0A53565219A832E005A6056 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = "<group>"; };
|
||||||
|
E0A53566219A832E005A6056 /* main.m */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.objc; path = main.m; sourceTree = "<group>"; };
|
||||||
|
E0A53574219A89FF005A6056 /* lib */ = {isa = PBXFileReference; lastKnownFileType = folder; path = lib; sourceTree = "<group>"; };
|
||||||
|
E0A5357A219AA3D1005A6056 /* AVFoundation.framework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.framework; name = AVFoundation.framework; path = System/Library/Frameworks/AVFoundation.framework; sourceTree = SDKROOT; };
|
||||||
|
E0A5357C219AA3DE005A6056 /* AssetsLibrary.framework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.framework; name = AssetsLibrary.framework; path = System/Library/Frameworks/AssetsLibrary.framework; sourceTree = SDKROOT; };
|
||||||
|
E0A5357E219AA3E7005A6056 /* CoreMedia.framework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.framework; name = CoreMedia.framework; path = System/Library/Frameworks/CoreMedia.framework; sourceTree = SDKROOT; };
|
||||||
|
ED37528E24B88737008DEBDA /* opencv2.framework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.framework; name = opencv2.framework; path = ocr_demo/opencv2.framework; sourceTree = "<group>"; };
|
||||||
|
ED37529E24B9AD5C008DEBDA /* ocr_clipper.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = ocr_clipper.cpp; sourceTree = "<group>"; };
|
||||||
|
ED37529F24B9AD5C008DEBDA /* ocr_db_post_process.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = ocr_db_post_process.h; sourceTree = "<group>"; };
|
||||||
|
ED3752A024B9AD5C008DEBDA /* ocr_db_post_process.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = ocr_db_post_process.cpp; sourceTree = "<group>"; };
|
||||||
|
ED3752A124B9AD5C008DEBDA /* ocr_crnn_process.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = ocr_crnn_process.cpp; sourceTree = "<group>"; };
|
||||||
|
ED3752A224B9AD5C008DEBDA /* ocr_clipper.hpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.h; path = ocr_clipper.hpp; sourceTree = "<group>"; };
|
||||||
|
ED3752A324B9AD5C008DEBDA /* ocr_crnn_process.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = ocr_crnn_process.h; sourceTree = "<group>"; };
|
||||||
|
ED3752A724B9ADB4008DEBDA /* BoxLayer.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = BoxLayer.h; sourceTree = "<group>"; };
|
||||||
|
ED3752A824B9ADB4008DEBDA /* BoxLayer.m */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.objc; path = BoxLayer.m; sourceTree = "<group>"; };
|
||||||
|
ED3752AA24B9B118008DEBDA /* Helpers.m */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.objc; path = Helpers.m; sourceTree = "<group>"; };
|
||||||
|
ED3752AB24B9B118008DEBDA /* Helpers.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = Helpers.h; sourceTree = "<group>"; };
|
||||||
|
ED3752B124B9D4C6008DEBDA /* ch_det_mv3_db_opt.nb */ = {isa = PBXFileReference; lastKnownFileType = file; path = ch_det_mv3_db_opt.nb; sourceTree = "<group>"; };
|
||||||
|
ED3752B224B9D4C6008DEBDA /* ch_rec_mv3_crnn_opt.nb */ = {isa = PBXFileReference; lastKnownFileType = file; path = ch_rec_mv3_crnn_opt.nb; sourceTree = "<group>"; };
|
||||||
|
ED3752BA24B9DAD6008DEBDA /* ocr.png */ = {isa = PBXFileReference; lastKnownFileType = image.png; path = ocr.png; sourceTree = "<group>"; };
|
||||||
|
ED3752BB24B9DAD7008DEBDA /* label_list.txt */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text; path = label_list.txt; sourceTree = "<group>"; };
|
||||||
|
/* End PBXFileReference section */
|
||||||
|
|
||||||
|
/* Begin PBXFrameworksBuildPhase section */
|
||||||
|
E0A53551219A8329005A6056 /* Frameworks */ = {
|
||||||
|
isa = PBXFrameworksBuildPhase;
|
||||||
|
buildActionMask = 2147483647;
|
||||||
|
files = (
|
||||||
|
ED37528F24B88737008DEBDA /* opencv2.framework in Frameworks */,
|
||||||
|
E0A5357F219AA3E7005A6056 /* CoreMedia.framework in Frameworks */,
|
||||||
|
E0A5357D219AA3DF005A6056 /* AssetsLibrary.framework in Frameworks */,
|
||||||
|
E0A5357B219AA3D1005A6056 /* AVFoundation.framework in Frameworks */,
|
||||||
|
);
|
||||||
|
runOnlyForDeploymentPostprocessing = 0;
|
||||||
|
};
|
||||||
|
/* End PBXFrameworksBuildPhase section */
|
||||||
|
|
||||||
|
/* Begin PBXGroup section */
|
||||||
|
E0846DE8230658DC00031405 /* models */ = {
|
||||||
|
isa = PBXGroup;
|
||||||
|
children = (
|
||||||
|
ED3752B124B9D4C6008DEBDA /* ch_det_mv3_db_opt.nb */,
|
||||||
|
ED3752B224B9D4C6008DEBDA /* ch_rec_mv3_crnn_opt.nb */,
|
||||||
|
);
|
||||||
|
path = models;
|
||||||
|
sourceTree = "<group>";
|
||||||
|
};
|
||||||
|
E0A5354B219A8329005A6056 = {
|
||||||
|
isa = PBXGroup;
|
||||||
|
children = (
|
||||||
|
E0A53556219A8329005A6056 /* ocr_demo */,
|
||||||
|
E0A53555219A8329005A6056 /* Products */,
|
||||||
|
E0A53570219A8945005A6056 /* Frameworks */,
|
||||||
|
);
|
||||||
|
sourceTree = "<group>";
|
||||||
|
};
|
||||||
|
E0A53555219A8329005A6056 /* Products */ = {
|
||||||
|
isa = PBXGroup;
|
||||||
|
children = (
|
||||||
|
E0A53554219A8329005A6056 /* paddle-lite-ocr.app */,
|
||||||
|
);
|
||||||
|
name = Products;
|
||||||
|
sourceTree = "<group>";
|
||||||
|
};
|
||||||
|
E0A53556219A8329005A6056 /* ocr_demo */ = {
|
||||||
|
isa = PBXGroup;
|
||||||
|
children = (
|
||||||
|
ED3752BB24B9DAD7008DEBDA /* label_list.txt */,
|
||||||
|
ED3752BA24B9DAD6008DEBDA /* ocr.png */,
|
||||||
|
ED3752AB24B9B118008DEBDA /* Helpers.h */,
|
||||||
|
ED3752AA24B9B118008DEBDA /* Helpers.m */,
|
||||||
|
ED3752A724B9ADB4008DEBDA /* BoxLayer.h */,
|
||||||
|
ED3752A824B9ADB4008DEBDA /* BoxLayer.m */,
|
||||||
|
ED37529D24B9AD5C008DEBDA /* pdocr */,
|
||||||
|
E0846DE8230658DC00031405 /* models */,
|
||||||
|
E0A18EDD219C03000015DC15 /* face.jpg */,
|
||||||
|
E0A53574219A89FF005A6056 /* lib */,
|
||||||
|
E0A53557219A832A005A6056 /* AppDelegate.h */,
|
||||||
|
E0A53558219A832A005A6056 /* AppDelegate.m */,
|
||||||
|
E0846DF2230BC93900031405 /* timer.h */,
|
||||||
|
E0A5355A219A832A005A6056 /* ViewController.h */,
|
||||||
|
E0A5355B219A832A005A6056 /* ViewController.mm */,
|
||||||
|
E0A5355D219A832A005A6056 /* Main.storyboard */,
|
||||||
|
E0A53560219A832E005A6056 /* Assets.xcassets */,
|
||||||
|
E0A53562219A832E005A6056 /* LaunchScreen.storyboard */,
|
||||||
|
E0A53565219A832E005A6056 /* Info.plist */,
|
||||||
|
E0A53566219A832E005A6056 /* main.m */,
|
||||||
|
A98EAD74A71FE136D084F392 /* OcrData.m */,
|
||||||
|
A98EAD76E0669B2BFB628B48 /* OcrData.h */,
|
||||||
|
);
|
||||||
|
path = ocr_demo;
|
||||||
|
sourceTree = "<group>";
|
||||||
|
};
|
||||||
|
E0A53570219A8945005A6056 /* Frameworks */ = {
|
||||||
|
isa = PBXGroup;
|
||||||
|
children = (
|
||||||
|
ED37528E24B88737008DEBDA /* opencv2.framework */,
|
||||||
|
E0A5357E219AA3E7005A6056 /* CoreMedia.framework */,
|
||||||
|
E0A5357C219AA3DE005A6056 /* AssetsLibrary.framework */,
|
||||||
|
E0A5357A219AA3D1005A6056 /* AVFoundation.framework */,
|
||||||
|
);
|
||||||
|
name = Frameworks;
|
||||||
|
sourceTree = "<group>";
|
||||||
|
};
|
||||||
|
ED37529D24B9AD5C008DEBDA /* pdocr */ = {
|
||||||
|
isa = PBXGroup;
|
||||||
|
children = (
|
||||||
|
ED37529E24B9AD5C008DEBDA /* ocr_clipper.cpp */,
|
||||||
|
ED37529F24B9AD5C008DEBDA /* ocr_db_post_process.h */,
|
||||||
|
ED3752A024B9AD5C008DEBDA /* ocr_db_post_process.cpp */,
|
||||||
|
ED3752A124B9AD5C008DEBDA /* ocr_crnn_process.cpp */,
|
||||||
|
ED3752A224B9AD5C008DEBDA /* ocr_clipper.hpp */,
|
||||||
|
ED3752A324B9AD5C008DEBDA /* ocr_crnn_process.h */,
|
||||||
|
);
|
||||||
|
path = pdocr;
|
||||||
|
sourceTree = "<group>";
|
||||||
|
};
|
||||||
|
/* End PBXGroup section */
|
||||||
|
|
||||||
|
/* Begin PBXNativeTarget section */
|
||||||
|
E0A53553219A8329005A6056 /* ocr_demo */ = {
|
||||||
|
isa = PBXNativeTarget;
|
||||||
|
buildConfigurationList = E0A5356A219A832E005A6056 /* Build configuration list for PBXNativeTarget "ocr_demo" */;
|
||||||
|
buildPhases = (
|
||||||
|
E0A53550219A8329005A6056 /* Sources */,
|
||||||
|
E0A53551219A8329005A6056 /* Frameworks */,
|
||||||
|
E0A53552219A8329005A6056 /* Resources */,
|
||||||
|
);
|
||||||
|
buildRules = (
|
||||||
|
);
|
||||||
|
dependencies = (
|
||||||
|
);
|
||||||
|
name = ocr_demo;
|
||||||
|
productName = seg_demo;
|
||||||
|
productReference = E0A53554219A8329005A6056 /* paddle-lite-ocr.app */;
|
||||||
|
productType = "com.apple.product-type.application";
|
||||||
|
};
|
||||||
|
/* End PBXNativeTarget section */
|
||||||
|
|
||||||
|
/* Begin PBXProject section */
		E0A5354C219A8329005A6056 /* Project object */ = {
			isa = PBXProject;
			attributes = {
				LastUpgradeCheck = 1010;
				ORGANIZATIONNAME = "Li,Xiaoyang(SYS)";
				TargetAttributes = {
					E0A53553219A8329005A6056 = {
						CreatedOnToolsVersion = 10.1;
					};
				};
			};
			buildConfigurationList = E0A5354F219A8329005A6056 /* Build configuration list for PBXProject "ocr_demo" */;
			compatibilityVersion = "Xcode 9.3";
			developmentRegion = en;
			hasScannedForEncodings = 0;
			knownRegions = (
				en,
				Base,
			);
			mainGroup = E0A5354B219A8329005A6056;
			productRefGroup = E0A53555219A8329005A6056 /* Products */;
			projectDirPath = "";
			projectRoot = "";
			targets = (
				E0A53553219A8329005A6056 /* ocr_demo */,
			);
		};
/* End PBXProject section */

/* Begin PBXResourcesBuildPhase section */
		E0A53552219A8329005A6056 /* Resources */ = {
			isa = PBXResourcesBuildPhase;
			buildActionMask = 2147483647;
			files = (
				E0A53564219A832E005A6056 /* LaunchScreen.storyboard in Resources */,
				ED3752B324B9D4C6008DEBDA /* ch_det_mv3_db_opt.nb in Resources */,
				E0A53576219A89FF005A6056 /* lib in Resources */,
				ED3752B424B9D4C6008DEBDA /* ch_rec_mv3_crnn_opt.nb in Resources */,
				E0A53561219A832E005A6056 /* Assets.xcassets in Resources */,
				ED3752BC24B9DAD7008DEBDA /* ocr.png in Resources */,
				E0A5355F219A832A005A6056 /* Main.storyboard in Resources */,
				ED3752BD24B9DAD7008DEBDA /* label_list.txt in Resources */,
			);
			runOnlyForDeploymentPostprocessing = 0;
		};
/* End PBXResourcesBuildPhase section */

/* Begin PBXSourcesBuildPhase section */
		E0A53550219A8329005A6056 /* Sources */ = {
			isa = PBXSourcesBuildPhase;
			buildActionMask = 2147483647;
			files = (
				E0A5355C219A832A005A6056 /* ViewController.mm in Sources */,
				ED3752A924B9ADB4008DEBDA /* BoxLayer.m in Sources */,
				E0A53567219A832E005A6056 /* main.m in Sources */,
				ED3752A424B9AD5C008DEBDA /* ocr_clipper.cpp in Sources */,
				E0A53559219A832A005A6056 /* AppDelegate.m in Sources */,
				ED3752A524B9AD5C008DEBDA /* ocr_db_post_process.cpp in Sources */,
				ED3752AC24B9B118008DEBDA /* Helpers.m in Sources */,
				ED3752A624B9AD5C008DEBDA /* ocr_crnn_process.cpp in Sources */,
				A98EA8B624CD8E85B9EFADA1 /* OcrData.m in Sources */,
			);
			runOnlyForDeploymentPostprocessing = 0;
		};
/* End PBXSourcesBuildPhase section */

/* Begin PBXVariantGroup section */
		E0A5355D219A832A005A6056 /* Main.storyboard */ = {
			isa = PBXVariantGroup;
			children = (
				E0A5355E219A832A005A6056 /* Base */,
			);
			name = Main.storyboard;
			sourceTree = "<group>";
		};
		E0A53562219A832E005A6056 /* LaunchScreen.storyboard */ = {
			isa = PBXVariantGroup;
			children = (
				E0A53563219A832E005A6056 /* Base */,
			);
			name = LaunchScreen.storyboard;
			sourceTree = "<group>";
		};
/* End PBXVariantGroup section */

/* Begin XCBuildConfiguration section */
		E0A53568219A832E005A6056 /* Debug */ = {
			isa = XCBuildConfiguration;
			buildSettings = {
				ALWAYS_SEARCH_USER_PATHS = NO;
				CLANG_ANALYZER_NONNULL = YES;
				CLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;
				CLANG_CXX_LANGUAGE_STANDARD = "gnu++14";
				CLANG_CXX_LIBRARY = "libc++";
				CLANG_ENABLE_MODULES = YES;
				CLANG_ENABLE_OBJC_ARC = YES;
				CLANG_ENABLE_OBJC_WEAK = YES;
				CLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;
				CLANG_WARN_BOOL_CONVERSION = YES;
				CLANG_WARN_COMMA = YES;
				CLANG_WARN_CONSTANT_CONVERSION = YES;
				CLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;
				CLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;
				CLANG_WARN_DOCUMENTATION_COMMENTS = YES;
				CLANG_WARN_EMPTY_BODY = YES;
				CLANG_WARN_ENUM_CONVERSION = YES;
				CLANG_WARN_INFINITE_RECURSION = YES;
				CLANG_WARN_INT_CONVERSION = YES;
				CLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;
				CLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;
				CLANG_WARN_OBJC_LITERAL_CONVERSION = YES;
				CLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;
				CLANG_WARN_RANGE_LOOP_ANALYSIS = YES;
				CLANG_WARN_STRICT_PROTOTYPES = YES;
				CLANG_WARN_SUSPICIOUS_MOVE = YES;
				CLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;
				CLANG_WARN_UNREACHABLE_CODE = YES;
				CLANG_WARN__DUPLICATE_METHOD_MATCH = YES;
				CODE_SIGN_IDENTITY = "iPhone Developer";
				COPY_PHASE_STRIP = NO;
				DEBUG_INFORMATION_FORMAT = dwarf;
				ENABLE_STRICT_OBJC_MSGSEND = YES;
				ENABLE_TESTABILITY = YES;
				GCC_C_LANGUAGE_STANDARD = gnu11;
				GCC_DYNAMIC_NO_PIC = NO;
				GCC_NO_COMMON_BLOCKS = YES;
				GCC_OPTIMIZATION_LEVEL = 0;
				GCC_PREPROCESSOR_DEFINITIONS = (
					"DEBUG=1",
					"$(inherited)",
				);
				GCC_WARN_64_TO_32_BIT_CONVERSION = YES;
				GCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;
				GCC_WARN_UNDECLARED_SELECTOR = YES;
				GCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;
				GCC_WARN_UNUSED_FUNCTION = YES;
				GCC_WARN_UNUSED_VARIABLE = YES;
				IPHONEOS_DEPLOYMENT_TARGET = 12.1;
				MTL_ENABLE_DEBUG_INFO = INCLUDE_SOURCE;
				MTL_FAST_MATH = YES;
				ONLY_ACTIVE_ARCH = YES;
				SDKROOT = iphoneos;
			};
			name = Debug;
		};
		E0A53569219A832E005A6056 /* Release */ = {
			isa = XCBuildConfiguration;
			buildSettings = {
				ALWAYS_SEARCH_USER_PATHS = NO;
				CLANG_ANALYZER_NONNULL = YES;
				CLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;
				CLANG_CXX_LANGUAGE_STANDARD = "gnu++14";
				CLANG_CXX_LIBRARY = "libc++";
				CLANG_ENABLE_MODULES = YES;
				CLANG_ENABLE_OBJC_ARC = YES;
				CLANG_ENABLE_OBJC_WEAK = YES;
				CLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;
				CLANG_WARN_BOOL_CONVERSION = YES;
				CLANG_WARN_COMMA = YES;
				CLANG_WARN_CONSTANT_CONVERSION = YES;
				CLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;
				CLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;
				CLANG_WARN_DOCUMENTATION_COMMENTS = YES;
				CLANG_WARN_EMPTY_BODY = YES;
				CLANG_WARN_ENUM_CONVERSION = YES;
				CLANG_WARN_INFINITE_RECURSION = YES;
				CLANG_WARN_INT_CONVERSION = YES;
				CLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;
				CLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;
				CLANG_WARN_OBJC_LITERAL_CONVERSION = YES;
				CLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;
				CLANG_WARN_RANGE_LOOP_ANALYSIS = YES;
				CLANG_WARN_STRICT_PROTOTYPES = YES;
				CLANG_WARN_SUSPICIOUS_MOVE = YES;
				CLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;
				CLANG_WARN_UNREACHABLE_CODE = YES;
				CLANG_WARN__DUPLICATE_METHOD_MATCH = YES;
				CODE_SIGN_IDENTITY = "iPhone Developer";
				COPY_PHASE_STRIP = NO;
				DEBUG_INFORMATION_FORMAT = "dwarf-with-dsym";
				ENABLE_NS_ASSERTIONS = NO;
				ENABLE_STRICT_OBJC_MSGSEND = YES;
				GCC_C_LANGUAGE_STANDARD = gnu11;
				GCC_NO_COMMON_BLOCKS = YES;
				GCC_WARN_64_TO_32_BIT_CONVERSION = YES;
				GCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;
				GCC_WARN_UNDECLARED_SELECTOR = YES;
				GCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;
				GCC_WARN_UNUSED_FUNCTION = YES;
				GCC_WARN_UNUSED_VARIABLE = YES;
				IPHONEOS_DEPLOYMENT_TARGET = 12.1;
				MTL_ENABLE_DEBUG_INFO = NO;
				MTL_FAST_MATH = YES;
				SDKROOT = iphoneos;
				VALIDATE_PRODUCT = YES;
			};
			name = Release;
		};
		E0A5356B219A832E005A6056 /* Debug */ = {
			isa = XCBuildConfiguration;
			buildSettings = {
				ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;
				CODE_SIGN_IDENTITY = "Apple Development";
				CODE_SIGN_STYLE = Automatic;
				DEVELOPMENT_TEAM = "";
				FRAMEWORK_SEARCH_PATHS = (
					"$(inherited)",
					"$(PROJECT_DIR)/ocr_demo",
				);
				INFOPLIST_FILE = "$(SRCROOT)/ocr_demo/Info.plist";
				IPHONEOS_DEPLOYMENT_TARGET = 12.0;
				LD_RUNPATH_SEARCH_PATHS = (
					"$(inherited)",
					"@executable_path/Frameworks",
				);
				LIBRARY_SEARCH_PATHS = "$(PROJECT_DIR)/ocr_demo/lib/";
				OTHER_LDFLAGS = "$(PROJECT_DIR)/ocr_demo/lib/libpaddle_api_light_bundled.a";
				PRODUCT_BUNDLE_IDENTIFIER = "com.baidu.paddlelite-ocr-demo";
				PRODUCT_NAME = "paddle-lite-ocr";
				PROVISIONING_PROFILE_SPECIFIER = "";
				TARGETED_DEVICE_FAMILY = "1,2";
				USER_HEADER_SEARCH_PATHS = "\"$(SRCROOT)/ocr_demo/\"";
			};
			name = Debug;
		};
		E0A5356C219A832E005A6056 /* Release */ = {
			isa = XCBuildConfiguration;
			buildSettings = {
				ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;
				CODE_SIGN_IDENTITY = "Apple Development";
				CODE_SIGN_STYLE = Automatic;
				DEVELOPMENT_TEAM = "";
				FRAMEWORK_SEARCH_PATHS = (
					"$(inherited)",
					"$(PROJECT_DIR)/ocr_demo",
				);
				INFOPLIST_FILE = "$(SRCROOT)/ocr_demo/Info.plist";
				IPHONEOS_DEPLOYMENT_TARGET = 12.0;
				LD_RUNPATH_SEARCH_PATHS = (
					"$(inherited)",
					"@executable_path/Frameworks",
				);
				LIBRARY_SEARCH_PATHS = "$(PROJECT_DIR)/ocr_demo/lib/";
				OTHER_LDFLAGS = "$(PROJECT_DIR)/ocr_demo/lib/libpaddle_api_light_bundled.a";
				PRODUCT_BUNDLE_IDENTIFIER = "com.baidu.paddlelite-ocr-demo";
				PRODUCT_NAME = "paddle-lite-ocr";
				PROVISIONING_PROFILE_SPECIFIER = "";
				TARGETED_DEVICE_FAMILY = "1,2";
				USER_HEADER_SEARCH_PATHS = "\"$(SRCROOT)/ocr_demo/\"";
			};
			name = Release;
		};
/* End XCBuildConfiguration section */

/* Begin XCConfigurationList section */
		E0A5354F219A8329005A6056 /* Build configuration list for PBXProject "ocr_demo" */ = {
			isa = XCConfigurationList;
			buildConfigurations = (
				E0A53568219A832E005A6056 /* Debug */,
				E0A53569219A832E005A6056 /* Release */,
			);
			defaultConfigurationIsVisible = 0;
			defaultConfigurationName = Release;
		};
		E0A5356A219A832E005A6056 /* Build configuration list for PBXNativeTarget "ocr_demo" */ = {
			isa = XCConfigurationList;
			buildConfigurations = (
				E0A5356B219A832E005A6056 /* Debug */,
				E0A5356C219A832E005A6056 /* Release */,
			);
			defaultConfigurationIsVisible = 0;
			defaultConfigurationName = Release;
		};
/* End XCConfigurationList section */
	};
	rootObject = E0A5354C219A8329005A6056 /* Project object */;
}
@ -0,0 +1,17 @@
//
//  AppDelegate.h
//  seg_demo
//
//  Created by Li,Xiaoyang(SYS) on 2018/11/13.
//  Copyright © 2018 Li,Xiaoyang(SYS). All rights reserved.
//

#import <UIKit/UIKit.h>

@interface AppDelegate : UIResponder <UIApplicationDelegate>

@property (strong, nonatomic) UIWindow *window;


@end
@ -0,0 +1,51 @@
//
//  AppDelegate.m
//  seg_demo
//
//  Created by Li,Xiaoyang(SYS) on 2018/11/13.
//  Copyright © 2018 Li,Xiaoyang(SYS). All rights reserved.
//

#import "AppDelegate.h"

@interface AppDelegate ()

@end

@implementation AppDelegate


- (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions {
    // Override point for customization after application launch.
    return YES;
}


- (void)applicationWillResignActive:(UIApplication *)application {
    // Sent when the application is about to move from active to inactive state. This can occur for certain types of temporary interruptions (such as an incoming phone call or SMS message) or when the user quits the application and it begins the transition to the background state.
    // Use this method to pause ongoing tasks, disable timers, and invalidate graphics rendering callbacks. Games should use this method to pause the game.
}


- (void)applicationDidEnterBackground:(UIApplication *)application {
    // Use this method to release shared resources, save user data, invalidate timers, and store enough application state information to restore your application to its current state in case it is terminated later.
    // If your application supports background execution, this method is called instead of applicationWillTerminate: when the user quits.
}


- (void)applicationWillEnterForeground:(UIApplication *)application {
    // Called as part of the transition from the background to the active state; here you can undo many of the changes made on entering the background.
}


- (void)applicationDidBecomeActive:(UIApplication *)application {
    // Restart any tasks that were paused (or not yet started) while the application was inactive. If the application was previously in the background, optionally refresh the user interface.
}


- (void)applicationWillTerminate:(UIApplication *)application {
    // Called when the application is about to terminate. Save data if appropriate. See also applicationDidEnterBackground:.
}


@end
@ -0,0 +1,98 @@
{
  "images" : [
    {
      "idiom" : "iphone",
      "size" : "20x20",
      "scale" : "2x"
    },
    {
      "idiom" : "iphone",
      "size" : "20x20",
      "scale" : "3x"
    },
    {
      "idiom" : "iphone",
      "size" : "29x29",
      "scale" : "2x"
    },
    {
      "idiom" : "iphone",
      "size" : "29x29",
      "scale" : "3x"
    },
    {
      "idiom" : "iphone",
      "size" : "40x40",
      "scale" : "2x"
    },
    {
      "idiom" : "iphone",
      "size" : "40x40",
      "scale" : "3x"
    },
    {
      "idiom" : "iphone",
      "size" : "60x60",
      "scale" : "2x"
    },
    {
      "idiom" : "iphone",
      "size" : "60x60",
      "scale" : "3x"
    },
    {
      "idiom" : "ipad",
      "size" : "20x20",
      "scale" : "1x"
    },
    {
      "idiom" : "ipad",
      "size" : "20x20",
      "scale" : "2x"
    },
    {
      "idiom" : "ipad",
      "size" : "29x29",
      "scale" : "1x"
    },
    {
      "idiom" : "ipad",
      "size" : "29x29",
      "scale" : "2x"
    },
    {
      "idiom" : "ipad",
      "size" : "40x40",
      "scale" : "1x"
    },
    {
      "idiom" : "ipad",
      "size" : "40x40",
      "scale" : "2x"
    },
    {
      "idiom" : "ipad",
      "size" : "76x76",
      "scale" : "1x"
    },
    {
      "idiom" : "ipad",
      "size" : "76x76",
      "scale" : "2x"
    },
    {
      "idiom" : "ipad",
      "size" : "83.5x83.5",
      "scale" : "2x"
    },
    {
      "idiom" : "ios-marketing",
      "size" : "1024x1024",
      "scale" : "1x"
    }
  ],
  "info" : {
    "version" : 1,
    "author" : "xcode"
  }
}
@ -0,0 +1,6 @@
{
  "info" : {
    "version" : 1,
    "author" : "xcode"
  }
}
@ -0,0 +1,25 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<document type="com.apple.InterfaceBuilder3.CocoaTouch.Storyboard.XIB" version="3.0" toolsVersion="13122.16" targetRuntime="iOS.CocoaTouch" propertyAccessControl="none" useAutolayout="YES" launchScreen="YES" useTraitCollections="YES" useSafeAreas="YES" colorMatched="YES" initialViewController="01J-lp-oVM">
    <dependencies>
        <plugIn identifier="com.apple.InterfaceBuilder.IBCocoaTouchPlugin" version="13104.12"/>
        <capability name="Safe area layout guides" minToolsVersion="9.0"/>
        <capability name="documents saved in the Xcode 8 format" minToolsVersion="8.0"/>
    </dependencies>
    <scenes>
        <!--View Controller-->
        <scene sceneID="EHf-IW-A2E">
            <objects>
                <viewController id="01J-lp-oVM" sceneMemberID="viewController">
                    <view key="view" contentMode="scaleToFill" id="Ze5-6b-2t3">
                        <rect key="frame" x="0.0" y="0.0" width="375" height="667"/>
                        <autoresizingMask key="autoresizingMask" widthSizable="YES" heightSizable="YES"/>
                        <color key="backgroundColor" red="1" green="1" blue="1" alpha="1" colorSpace="custom" customColorSpace="sRGB"/>
                        <viewLayoutGuide key="safeArea" id="6Tk-OE-BBY"/>
                    </view>
                </viewController>
                <placeholder placeholderIdentifier="IBFirstResponder" id="iYj-Kq-Ea1" userLabel="First Responder" sceneMemberID="firstResponder"/>
            </objects>
            <point key="canvasLocation" x="53" y="375"/>
        </scene>
    </scenes>
</document>
@ -0,0 +1,111 @@
<?xml version="1.0" encoding="UTF-8"?>
<document type="com.apple.InterfaceBuilder3.CocoaTouch.Storyboard.XIB" version="3.0" toolsVersion="16097" targetRuntime="iOS.CocoaTouch" propertyAccessControl="none" useAutolayout="YES" useTraitCollections="YES" useSafeAreas="YES" colorMatched="YES" initialViewController="BYZ-38-t0r">
    <device id="retina4_7" orientation="portrait" appearance="light"/>
    <dependencies>
        <deployment identifier="iOS"/>
        <plugIn identifier="com.apple.InterfaceBuilder.IBCocoaTouchPlugin" version="16087"/>
        <capability name="Safe area layout guides" minToolsVersion="9.0"/>
        <capability name="documents saved in the Xcode 8 format" minToolsVersion="8.0"/>
    </dependencies>
    <scenes>
        <!--View Controller-->
        <scene sceneID="tne-QT-ifu">
            <objects>
                <viewController id="BYZ-38-t0r" customClass="ViewController" sceneMemberID="viewController">
                    <view key="view" contentMode="scaleToFill" id="8bC-Xf-vdC">
                        <rect key="frame" x="0.0" y="0.0" width="375" height="667"/>
                        <autoresizingMask key="autoresizingMask" widthSizable="YES" heightSizable="YES"/>
                        <subviews>
                            <switch opaque="NO" contentMode="scaleToFill" horizontalHuggingPriority="750" verticalHuggingPriority="750" contentHorizontalAlignment="center" contentVerticalAlignment="center" translatesAutoresizingMaskIntoConstraints="NO" id="yZw-YR-x44">
                                <rect key="frame" x="114.5" y="624" width="51" height="31"/>
                            </switch>
                            <switch opaque="NO" contentMode="scaleToFill" horizontalHuggingPriority="750" verticalHuggingPriority="750" contentHorizontalAlignment="center" contentVerticalAlignment="center" translatesAutoresizingMaskIntoConstraints="NO" id="wN7-2M-FdP">
                                <rect key="frame" x="16" y="624" width="56" height="31"/>
                            </switch>
                            <label opaque="NO" userInteractionEnabled="NO" contentMode="left" horizontalHuggingPriority="251" verticalHuggingPriority="251" text="前/后摄像头" textAlignment="natural" lineBreakMode="tailTruncation" baselineAdjustment="alignBaselines" adjustsFontSizeToFit="NO" translatesAutoresizingMaskIntoConstraints="NO" id="lu6-Fq-OIg">
                                <rect key="frame" x="93" y="594" width="92" height="21"/>
                                <fontDescription key="fontDescription" type="system" pointSize="17"/>
                                <nil key="textColor"/>
                                <nil key="highlightedColor"/>
                            </label>
                            <label opaque="NO" userInteractionEnabled="NO" contentMode="left" horizontalHuggingPriority="251" verticalHuggingPriority="251" horizontalCompressionResistancePriority="751" text="开启相机" textAlignment="natural" lineBreakMode="tailTruncation" baselineAdjustment="alignBaselines" adjustsFontSizeToFit="NO" translatesAutoresizingMaskIntoConstraints="NO" id="VfD-z1-Okj">
                                <rect key="frame" x="8" y="594" width="70" height="21"/>
                                <fontDescription key="fontDescription" type="system" pointSize="17"/>
                                <nil key="textColor"/>
                                <nil key="highlightedColor"/>
                            </label>
                            <imageView userInteractionEnabled="NO" contentMode="scaleToFill" horizontalHuggingPriority="251" verticalHuggingPriority="251" translatesAutoresizingMaskIntoConstraints="NO" id="ptx-ND-Ywq">
                                <rect key="frame" x="0.0" y="0.0" width="375" height="564"/>
                            </imageView>
                            <label opaque="NO" userInteractionEnabled="NO" contentMode="left" horizontalHuggingPriority="251" verticalHuggingPriority="251" text="" textAlignment="natural" lineBreakMode="tailTruncation" baselineAdjustment="alignBaselines" adjustsFontSizeToFit="NO" translatesAutoresizingMaskIntoConstraints="NO" id="T9y-OT-OQS">
                                <rect key="frame" x="33" y="574" width="309" height="10"/>
                                <constraints>
                                    <constraint firstAttribute="height" constant="10" id="pMg-XK-d3N"/>
                                </constraints>
                                <fontDescription key="fontDescription" type="system" pointSize="17"/>
                                <nil key="textColor"/>
                                <nil key="highlightedColor"/>
                            </label>
                            <button opaque="NO" contentMode="scaleToFill" contentHorizontalAlignment="center" contentVerticalAlignment="center" buttonType="roundedRect" lineBreakMode="middleTruncation" translatesAutoresizingMaskIntoConstraints="NO" id="HJ5-UE-PrR">
                                <rect key="frame" x="302" y="620.5" width="43" height="38"/>
                                <fontDescription key="fontDescription" type="system" pointSize="21"/>
                                <state key="normal" title="拍照"/>
                                <connections>
                                    <action selector="cap_photo:" destination="BYZ-38-t0r" eventType="touchUpInside" id="PbV-pB-BRY"/>
                                </connections>
                            </button>
                            <switch opaque="NO" contentMode="scaleToFill" horizontalHuggingPriority="750" verticalHuggingPriority="750" contentHorizontalAlignment="center" contentVerticalAlignment="center" translatesAutoresizingMaskIntoConstraints="NO" id="rc6-ZX-igF">
                                <rect key="frame" x="208" y="624" width="51" height="31"/>
                                <connections>
                                    <action selector="swith_video_photo:" destination="BYZ-38-t0r" eventType="valueChanged" id="I05-92-4FW"/>
                                </connections>
                            </switch>
                            <label opaque="NO" userInteractionEnabled="NO" contentMode="left" horizontalHuggingPriority="251" verticalHuggingPriority="251" text="视频/拍照" textAlignment="natural" lineBreakMode="tailTruncation" baselineAdjustment="alignBaselines" adjustsFontSizeToFit="NO" translatesAutoresizingMaskIntoConstraints="NO" id="0tm-fo-hjF">
                                <rect key="frame" x="195" y="595" width="75" height="21"/>
                                <fontDescription key="fontDescription" type="system" pointSize="17"/>
                                <nil key="textColor"/>
                                <nil key="highlightedColor"/>
                            </label>
                        </subviews>
                        <color key="backgroundColor" red="1" green="1" blue="1" alpha="1" colorSpace="custom" customColorSpace="sRGB"/>
                        <constraints>
                            <constraint firstItem="VfD-z1-Okj" firstAttribute="top" secondItem="T9y-OT-OQS" secondAttribute="bottom" constant="10" id="BZ1-F0-re1"/>
                            <constraint firstItem="wN7-2M-FdP" firstAttribute="top" secondItem="rc6-ZX-igF" secondAttribute="top" id="Gzf-hC-X6O"/>
                            <constraint firstItem="wN7-2M-FdP" firstAttribute="leading" secondItem="8bC-Xf-vdC" secondAttribute="leadingMargin" id="JB8-MT-bdB"/>
                            <constraint firstItem="lu6-Fq-OIg" firstAttribute="leading" secondItem="VfD-z1-Okj" secondAttribute="trailing" constant="15" id="JbA-wd-hE8"/>
                            <constraint firstItem="wN7-2M-FdP" firstAttribute="centerX" secondItem="VfD-z1-Okj" secondAttribute="centerX" id="LW4-R4-nh2"/>
                            <constraint firstItem="0tm-fo-hjF" firstAttribute="leading" secondItem="lu6-Fq-OIg" secondAttribute="trailing" constant="10" id="NDI-W8-717"/>
                            <constraint firstItem="6Tk-OE-BBY" firstAttribute="trailing" secondItem="ptx-ND-Ywq" secondAttribute="trailing" id="V5z-FH-SFs"/>
                            <constraint firstItem="ptx-ND-Ywq" firstAttribute="leading" secondItem="6Tk-OE-BBY" secondAttribute="leading" id="dET-yr-Mon"/>
                            <constraint firstItem="ptx-ND-Ywq" firstAttribute="top" secondItem="6Tk-OE-BBY" secondAttribute="top" id="fn8-6Z-tv4"/>
                            <constraint firstItem="T9y-OT-OQS" firstAttribute="leading" secondItem="6Tk-OE-BBY" secondAttribute="leading" constant="33" id="iMB-Zg-Hsa"/>
                            <constraint firstItem="VfD-z1-Okj" firstAttribute="leading" secondItem="6Tk-OE-BBY" secondAttribute="leading" constant="8" id="izE-la-Fhu"/>
                            <constraint firstItem="wN7-2M-FdP" firstAttribute="top" secondItem="VfD-z1-Okj" secondAttribute="bottom" constant="9" id="jcU-7c-FNS"/>
                            <constraint firstItem="HJ5-UE-PrR" firstAttribute="centerY" secondItem="rc6-ZX-igF" secondAttribute="centerY" id="lpA-wq-cXI"/>
                            <constraint firstItem="6Tk-OE-BBY" firstAttribute="trailing" secondItem="T9y-OT-OQS" secondAttribute="trailing" constant="33" id="mD1-P0-mgB"/>
                            <constraint firstItem="rc6-ZX-igF" firstAttribute="centerX" secondItem="0tm-fo-hjF" secondAttribute="centerX" id="p5w-6o-OqW"/>
                            <constraint firstItem="rc6-ZX-igF" firstAttribute="top" secondItem="0tm-fo-hjF" secondAttribute="bottom" constant="8" id="rzr-oM-f7f"/>
                            <constraint firstItem="6Tk-OE-BBY" firstAttribute="trailing" secondItem="HJ5-UE-PrR" secondAttribute="trailing" constant="30" id="tYA-x1-MRj"/>
                            <constraint firstItem="T9y-OT-OQS" firstAttribute="top" secondItem="ptx-ND-Ywq" secondAttribute="bottom" constant="10" id="vNp-h8-QF9"/>
                            <constraint firstItem="VfD-z1-Okj" firstAttribute="baseline" secondItem="lu6-Fq-OIg" secondAttribute="baseline" id="wcZ-9g-OTX"/>
                            <constraint firstItem="6Tk-OE-BBY" firstAttribute="bottom" secondItem="wN7-2M-FdP" secondAttribute="bottom" constant="12" id="xm2-Eb-dxp"/>
                            <constraint firstItem="wN7-2M-FdP" firstAttribute="top" secondItem="yZw-YR-x44" secondAttribute="top" id="yHi-Fb-V4o"/>
                            <constraint firstItem="yZw-YR-x44" firstAttribute="centerX" secondItem="lu6-Fq-OIg" secondAttribute="centerX" id="yXW-Ap-sa7"/>
                            <constraint firstItem="VfD-z1-Okj" firstAttribute="centerY" secondItem="lu6-Fq-OIg" secondAttribute="centerY" id="zQ1-gg-Rnh"/>
                        </constraints>
                        <viewLayoutGuide key="safeArea" id="6Tk-OE-BBY"/>
                    </view>
                    <connections>
                        <outlet property="flag_back_cam" destination="yZw-YR-x44" id="z5O-BW-sm7"/>
                        <outlet property="flag_process" destination="wN7-2M-FdP" id="i8h-CM-ida"/>
                        <outlet property="flag_video" destination="rc6-ZX-igF" id="Uch-KB-gwF"/>
                        <outlet property="imageView" destination="ptx-ND-Ywq" id="XjA-C2-hvm"/>
                        <outlet property="result" destination="T9y-OT-OQS" id="6kB-Ha-dfo"/>
                    </connections>
                </viewController>
                <placeholder placeholderIdentifier="IBFirstResponder" id="dkx-z0-nzr" sceneMemberID="firstResponder"/>
            </objects>
            <point key="canvasLocation" x="53.600000000000001" y="62.518740629685162"/>
        </scene>
    </scenes>
</document>
@ -0,0 +1,20 @@
//
// Created by chenxiaoyu on 2018/5/5.
// Copyright (c) 2018 baidu. All rights reserved.
//

#import <Foundation/Foundation.h>
#import <UIKit/UIKit.h>
#import "OcrData.h"


@interface BoxLayer : CAShapeLayer

/**
 * Render the OCR result
 */
-(void) renderOcrPolygon: (OcrData *)data withHeight:(CGFloat)originHeight withWidth:(CGFloat)originWidth withLabel:(bool) withLabel;



@end
@ -0,0 +1,80 @@
//
// Created by chenxiaoyu on 2018/5/5.
// Copyright (c) 2018 baidu. All rights reserved.
//

#include "BoxLayer.h"
#import "Helpers.h"

@implementation BoxLayer {

}

#define MAIN_COLOR UIColorFromRGB(0x3B85F5)
- (void)renderOcrPolygon:(OcrData *)d withHeight:(CGFloat)originHeight withWidth:(CGFloat)originWidth withLabel:(bool)withLabel {

    if ([d.polygonPoints count] != 4) {
        NSLog(@"polygonPoints size is not 4");
        return;
    }

    CGPoint startPoint = [d.polygonPoints[0] CGPointValue];
    NSString *text = d.label;

    CGFloat x = startPoint.x * originWidth;
    CGFloat y = startPoint.y * originHeight;
    CGFloat width = originWidth - x;
    CGFloat height = originHeight - y;


    UIFont *font = [UIFont systemFontOfSize:16];
    NSDictionary *attrs = @{
            // NSStrokeColorAttributeName: [UIColor blackColor],
            NSForegroundColorAttributeName: [UIColor whiteColor],
            // NSStrokeWidthAttributeName : @((float) -6.0),
            NSFontAttributeName: font
    };


    if (withLabel) {
        NSAttributedString *displayStr = [[NSAttributedString alloc] initWithString:text attributes:attrs];
        CATextLayer *textLayer = [[CATextLayer alloc] init];
        textLayer.wrapped = YES;
        textLayer.string = displayStr;
        textLayer.frame = CGRectMake(x + 2, y + 2, width, height);
        textLayer.contentsScale = [[UIScreen mainScreen] scale];

        // Adding a shadow made the labels look cluttered
        // textLayer.shadowColor = [MAIN_COLOR CGColor];
        // textLayer.shadowOffset = CGSizeMake(2.0, 2.0);
        // textLayer.shadowOpacity = 0.8;
        // textLayer.shadowRadius = 0.0;

        [self addSublayer:textLayer];
    }


    UIBezierPath *path = [UIBezierPath new];


    [path moveToPoint:CGPointMake(startPoint.x * originWidth, startPoint.y * originHeight)];
    for (NSValue *val in d.polygonPoints) {
        CGPoint p = [val CGPointValue];
        [path addLineToPoint:CGPointMake(p.x * originWidth, p.y * originHeight)];
    }
    [path closePath];

    self.path = path.CGPath;
    self.strokeColor = MAIN_COLOR.CGColor;
    self.lineWidth = 2.0;
    self.fillColor = [MAIN_COLOR colorWithAlphaComponent:0.2].CGColor;
    self.lineJoin = kCALineJoinBevel;

}

- (void)renderSingleBox:(OcrData *)data withHeight:(CGFloat)originHeight withWidth:(CGFloat)originWidth {
    [self renderOcrPolygon:data withHeight:originHeight withWidth:originWidth withLabel:YES];
}


@end
@ -0,0 +1,31 @@
//
//  Helpers.h
//  EasyDLDemo
//
//  Created by chenxiaoyu on 2018/5/14.
//  Copyright © 2018 baidu. All rights reserved.
//

#import <Foundation/Foundation.h>
#import <UIKit/UIImage.h>

#define UIColorFromRGB(rgbValue) \
    [UIColor colorWithRed:((float)((rgbValue & 0xFF0000) >> 16))/255.0 \
                    green:((float)((rgbValue & 0x00FF00) >> 8))/255.0 \
                     blue:((float)((rgbValue & 0x0000FF) >> 0))/255.0 \
                    alpha:1.0]

#define SCREEN_HEIGHT [UIScreen mainScreen].bounds.size.height
#define SCREEN_WIDTH [UIScreen mainScreen].bounds.size.width

#define HIGHLIGHT_COLOR UIColorFromRGB(0xF5A623)

//#define BTN_HIGHTLIGH_TEXT_COLOR UIColorFromRGB(0xF5A623)


@interface Helpers : NSObject {


}

@end
@ -0,0 +1,17 @@
//
//  Helpers.m
//  EasyDLDemo
//
//  Created by chenxiaoyu on 2018/5/14.
//  Copyright © 2018 baidu. All rights reserved.
//

#import "Helpers.h"




@implementation Helpers




@end
@@ -0,0 +1,49 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>CFBundleDevelopmentRegion</key>
    <string>$(DEVELOPMENT_LANGUAGE)</string>
    <key>CFBundleDisplayName</key>
    <string>$(PRODUCT_NAME)</string>
    <key>CFBundleExecutable</key>
    <string>$(EXECUTABLE_NAME)</string>
    <key>CFBundleIdentifier</key>
    <string>$(PRODUCT_BUNDLE_IDENTIFIER)</string>
    <key>CFBundleInfoDictionaryVersion</key>
    <string>6.0</string>
    <key>CFBundleName</key>
    <string>$(PRODUCT_NAME)</string>
    <key>CFBundlePackageType</key>
    <string>APPL</string>
    <key>CFBundleShortVersionString</key>
    <string>0.1</string>
    <key>CFBundleVersion</key>
    <string>1</string>
    <key>LSRequiresIPhoneOS</key>
    <true/>
    <key>NSCameraUsageDescription</key>
    <string>for test</string>
    <key>UILaunchStoryboardName</key>
    <string>LaunchScreen</string>
    <key>UIMainStoryboardFile</key>
    <string>Main</string>
    <key>UIRequiredDeviceCapabilities</key>
    <array>
        <string>armv7</string>
    </array>
    <key>UISupportedInterfaceOrientations</key>
    <array>
        <string>UIInterfaceOrientationPortrait</string>
        <string>UIInterfaceOrientationLandscapeLeft</string>
        <string>UIInterfaceOrientationLandscapeRight</string>
    </array>
    <key>UISupportedInterfaceOrientations~ipad</key>
    <array>
        <string>UIInterfaceOrientationPortrait</string>
        <string>UIInterfaceOrientationPortraitUpsideDown</string>
        <string>UIInterfaceOrientationLandscapeLeft</string>
        <string>UIInterfaceOrientationLandscapeRight</string>
    </array>
</dict>
</plist>
@@ -0,0 +1,15 @@
//
//  Created by Lv,Xiangxiang on 2020/7/11.
//  Copyright (c) 2020 Li,Xiaoyang(SYS). All rights reserved.
//

#import <Foundation/Foundation.h>

@interface OcrData : NSObject
@property(nonatomic, copy) NSString *label;
@property(nonatomic) int category;
@property(nonatomic) float accuracy;
@property(nonatomic) NSArray *polygonPoints;

@end
@@ -0,0 +1,12 @@
//
//  Created by Lv,Xiangxiang on 2020/7/11.
//  Copyright (c) 2020 Li,Xiaoyang(SYS). All rights reserved.
//

#import "OcrData.h"

@implementation OcrData {

}
@end
@@ -0,0 +1,16 @@
//
//  ViewController.h
//  seg_demo
//
//  Created by Li,Xiaoyang(SYS) on 2018/11/13.
//  Copyright © 2018 Li,Xiaoyang(SYS). All rights reserved.
//

#import <UIKit/UIKit.h>

@interface ViewController : UIViewController

@end