Merge branch 'dygraph' of https://github.com/PaddlePaddle/PaddleOCR into update_notice_1215
|
@ -96,7 +96,7 @@ For a new language request, please refer to [Guideline for new language_requests
|
||||||
- [Benchmark](./doc/doc_en/benchmark_en.md)
|
- [Benchmark](./doc/doc_en/benchmark_en.md)
|
||||||
- Data Annotation and Synthesis
|
- Data Annotation and Synthesis
|
||||||
- [Semi-automatic Annotation Tool: PPOCRLabel](./PPOCRLabel/README.md)
|
- [Semi-automatic Annotation Tool: PPOCRLabel](./PPOCRLabel/README.md)
|
||||||
- [Data Synthesis Tool: Style_Edit](./StyleTextRec/README.md)
|
- [Data Synthesis Tool: Style-Text](./StyleText/README.md)
|
||||||
- [Other Data Annotation Tools](./doc/doc_en/data_annotation_en.md)
|
- [Other Data Annotation Tools](./doc/doc_en/data_annotation_en.md)
|
||||||
- [Other Data Synthesis Tools](./doc/doc_en/data_synthesis_en.md)
|
- [Other Data Synthesis Tools](./doc/doc_en/data_synthesis_en.md)
|
||||||
- Datasets
|
- Datasets
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
## Style Text
|
## Style-Text
|
||||||
|
|
||||||
### Table of Contents
|
### Table of Contents
|
||||||
- [1. Tool Introduction](#工具简介)
|
- [1. Tool Introduction](#工具简介)
|
||||||
|
@ -85,7 +85,7 @@ python3 -m tools.synth_image -c configs/config.yml --style_image examples/style_
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
#### Batch Synthesis
|
#### Batch Synthesis
|
||||||
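In practical applications, it is often necessary to synthesize images in batches and add them to the training set. StyleText can synthesize data in batches from a set of style images and a corpus. The synthesis process is as follows:
|
In practical applications, it is often necessary to synthesize images in batches and add them to the training set. Style-Text can synthesize data in batches from a set of style images and a corpus. The synthesis process is as follows: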
|
||||||
|
|
||||||
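1. Configure the paths of the target-scene style images and the corpus in `configs/dataset_config.yml`, as follows:
|
1. Configure the paths of the target-scene style images and the corpus in `configs/dataset_config.yml`, as follows: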
|
||||||
|
|
||||||
|
@ -100,7 +100,7 @@ python3 -m tools.synth_image -c configs/config.yml --style_image examples/style_
|
||||||
* `language`: the language of the corpus;
|
* `language`: the language of the corpus;
|
||||||
* `corpus_file`: the path of the corpus file.
|
* `corpus_file`: the path of the corpus file.
|
||||||
|
|
||||||
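StyleText also provides 50,000 general-scene images in Chinese, English, and Korean for use as text style images, which makes it easier to synthesize text images covering a wide range of scenes. The figure below shows some examples.
|
Style-Text also provides 50,000 general-scene images in Chinese, English, and Korean for use as text style images, which makes it easier to synthesize text images covering a wide range of scenes. The figure below shows some examples.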
|
||||||
|
|
||||||
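50,000 general-scene images in Chinese, English, and Korean: [download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/style_text/chkoen_5w.tar)
|
50,000 general-scene images in Chinese, English, and Korean: [download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/style_text/chkoen_5w.tar)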
|
||||||
|
|
||||||
|
@ -116,7 +116,7 @@ python3 -m tools.synth_image -c configs/config.yml --style_image examples/style_
|
||||||
|
|
||||||
<a name="应用案例"></a>
|
<a name="应用案例"></a>
|
||||||
### 4. Applications
|
### 4. Applications
|
||||||
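The following two scenarios, English and digit recognition on metal surfaces and general Korean recognition, illustrate how data synthesized with StyleText can improve text recognition in practice. The figure below shows some examples of real-scene images and synthesized images:
|
The following two scenarios, English and digit recognition on metal surfaces and general Korean recognition, illustrate how data synthesized with Style-Text can improve text recognition in practice. The figure below shows some examples of real-scene images and synthesized images: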
|
||||||
|
|
||||||
<div align="center">
|
<div align="center">
|
||||||
<img src="doc/images/6.png" width="800">
|
<img src="doc/images/6.png" width="800">
|
||||||
|
@ -134,38 +134,38 @@ python3 -m tools.synth_image -c configs/config.yml --style_image examples/style_
|
||||||
### 5. Code Structure
|
### 5. Code Structure
|
||||||
```
|
```
|
||||||
style_text_rec
|
style_text_rec
|
||||||
|-- arch
|
|-- arch // Network structure definition files
|
||||||
| |-- base_module.py
|
| |-- base_module.py
|
||||||
| |-- decoder.py
|
| |-- decoder.py
|
||||||
| |-- encoder.py
|
| |-- encoder.py
|
||||||
| |-- spectral_norm.py
|
| |-- spectral_norm.py
|
||||||
| `-- style_text_rec.py
|
| `-- style_text_rec.py
|
||||||
|-- configs
|
|-- configs // Configuration files
|
||||||
| |-- config.yml
|
| |-- config.yml
|
||||||
| `-- dataset_config.yml
|
| `-- dataset_config.yml
|
||||||
|-- engine
|
|-- engine // Data synthesis engine
|
||||||
| |-- corpus_generators.py
|
| |-- corpus_generators.py // Sample a corpus from text or generate one randomly
|
||||||
| |-- predictors.py
|
| |-- predictors.py // Call the network to generate data
|
||||||
| |-- style_samplers.py
|
| |-- style_samplers.py // Sample style images
|
||||||
| |-- synthesisers.py
|
| |-- synthesisers.py // Schedule the modules to synthesize data
|
||||||
| |-- text_drawers.py
|
| |-- text_drawers.py // Generate standard text images used as input
|
||||||
| `-- writers.py
|
| `-- writers.py // Write the synthesized images and labels to a local directory
|
||||||
|-- examples
|
|-- examples // Example files
|
||||||
| |-- corpus
|
| |-- corpus
|
||||||
| | `-- example.txt
|
| | `-- example.txt
|
||||||
| |-- image_list.txt
|
| |-- image_list.txt
|
||||||
| `-- style_images
|
| `-- style_images
|
||||||
| |-- 1.jpg
|
| |-- 1.jpg
|
||||||
| `-- 2.jpg
|
| `-- 2.jpg
|
||||||
|-- fonts
|
|-- fonts // Font files
|
||||||
| |-- ch_standard.ttf
|
| |-- ch_standard.ttf
|
||||||
| |-- en_standard.ttf
|
| |-- en_standard.ttf
|
||||||
| `-- ko_standard.ttf
|
| `-- ko_standard.ttf
|
||||||
|-- tools
|
|-- tools // Program entry points
|
||||||
| |-- __init__.py
|
| |-- __init__.py
|
||||||
| |-- synth_dataset.py
|
| |-- synth_dataset.py // Synthesize data in batches
|
||||||
| `-- synth_image.py
|
| `-- synth_image.py // Synthesize a single image
|
||||||
`-- utils
|
`-- utils // Other basic utility modules
|
||||||
|-- config.py
|
|-- config.py
|
||||||
|-- load_params.py
|
|-- load_params.py
|
||||||
|-- logging.py
|
|-- logging.py
|
||||||
|
|
|
@ -20,7 +20,7 @@ The downloadable models provided by PaddleOCR include `inference model`, `traine
|
||||||
|
|
||||||
|model name|description|config|model size|download|
|
|model name|description|config|model size|download|
|
||||||
| --- | --- | --- | --- | --- |
|
| --- | --- | --- | --- | --- |
|
||||||
|ch_ppocr_mobile_slim_v2.0_det|Slim pruned lightweight model, supporting Chinese, English, multilingual text detection|[ch_det_mv3_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)| |[inference model (coming soon)](link) / [slim model (coming soon)](link)|
|
|ch_ppocr_mobile_slim_v2.0_det|Slim pruned lightweight model, supporting Chinese, English, multilingual text detection|[ch_det_mv3_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)| |inference model (coming soon) / slim model (coming soon)|
|
||||||
|ch_ppocr_mobile_v2.0_det|Original lightweight model, supporting Chinese, English, multilingual text detection|[ch_det_mv3_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)|3M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|
|
|ch_ppocr_mobile_v2.0_det|Original lightweight model, supporting Chinese, English, multilingual text detection|[ch_det_mv3_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)|3M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|
|
||||||
|ch_ppocr_server_v2.0_det|General model, which is larger than the lightweight model, but achieved better performance|[ch_det_res18_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml)|47M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)|
|
|ch_ppocr_server_v2.0_det|General model, which is larger than the lightweight model, but achieved better performance|[ch_det_res18_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml)|47M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)|
|
||||||
|
|
||||||
|
@ -32,7 +32,7 @@ The downloadable models provided by PaddleOCR include `inference model`, `traine
|
||||||
|
|
||||||
|model name|description|config|model size|download|
|
|model name|description|config|model size|download|
|
||||||
| --- | --- | --- | --- | --- |
|
| --- | --- | --- | --- | --- |
|
||||||
|ch_ppocr_mobile_slim_v2.0_rec|Slim pruned and quantized lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)| |[inference model (coming soon)](link) / [slim model (coming soon)](link) |
|
|ch_ppocr_mobile_slim_v2.0_rec|Slim pruned and quantized lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)| |inference model (coming soon) / slim model (coming soon) |
|
||||||
|ch_ppocr_mobile_v2.0_rec|Original lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)|3.71M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
|
|ch_ppocr_mobile_v2.0_rec|Original lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)|3.71M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
|
||||||
|ch_ppocr_server_v2.0_rec|General model, supporting Chinese, English and number recognition|[rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml)|94.8M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
|
|ch_ppocr_server_v2.0_rec|General model, supporting Chinese, English and number recognition|[rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml)|94.8M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
|
||||||
|
|
||||||
|
@ -44,7 +44,7 @@ The downloadable models provided by PaddleOCR include `inference model`, `traine
|
||||||
|
|
||||||
|model name|description|config|model size|download|
|
|model name|description|config|model size|download|
|
||||||
| --- | --- | --- | --- | --- |
|
| --- | --- | --- | --- | --- |
|
||||||
|en_number_mobile_slim_v2.0_rec|Slim pruned and quantized lightweight model, supporting English and number recognition|[rec_en_number_lite_train.yml](../../configs/rec/multi_language/rec_en_number_lite_train.yml)| |[inference model (coming soon)](link) / [slim model (coming soon)](link) |
|
|en_number_mobile_slim_v2.0_rec|Slim pruned and quantized lightweight model, supporting English and number recognition|[rec_en_number_lite_train.yml](../../configs/rec/multi_language/rec_en_number_lite_train.yml)| |inference model (coming soon) / slim model (coming soon) |
|
||||||
|en_number_mobile_v2.0_rec|Original lightweight model, supporting English and number recognition|[rec_en_number_lite_train.yml](../../configs/rec/multi_language/rec_en_number_lite_train.yml)|2.56M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_train.tar) |
|
|en_number_mobile_v2.0_rec|Original lightweight model, supporting English and number recognition|[rec_en_number_lite_train.yml](../../configs/rec/multi_language/rec_en_number_lite_train.yml)|2.56M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_train.tar) |
|
||||||
|
|
||||||
<a name="Multilingual"></a>
|
<a name="Multilingual"></a>
|
||||||
|
@ -62,6 +62,6 @@ The downloadable models provided by PaddleOCR include `inference model`, `traine
|
||||||
|
|
||||||
|model name|description|config|model size|download|
|
|model name|description|config|model size|download|
|
||||||
| --- | --- | --- | --- | --- |
|
| --- | --- | --- | --- | --- |
|
||||||
|ch_ppocr_mobile_slim_v2.0_cls|Slim quantized model|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)| |[inference model (coming soon)](link) / [trained model](link) / [slim model](link) |
|
|ch_ppocr_mobile_slim_v2.0_cls|Slim quantized model|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)| |inference model (coming soon) / trained model / slim model|
|
||||||
|ch_ppocr_mobile_v2.0_cls|Original model|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)|1.38M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |
|
|ch_ppocr_mobile_v2.0_cls|Original model|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)|1.38M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |
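
The download links in the tables above point to `.tar` archives. As a minimal sketch, assuming you just want to fetch one archive and unpack it locally, the helper below uses only the Python standard library. The function names `archive_name` and `fetch_and_extract` and the `./inference` target directory are hypothetical and are not part of PaddleOCR.

```python
import tarfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def archive_name(url):
    """Return the file name at the end of a download URL."""
    return Path(urlparse(url).path).name


def fetch_and_extract(url, dest_dir):
    """Download a .tar archive into dest_dir (skipped if already present),
    unpack it there, and return the list of member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / archive_name(url)
    if not archive.exists():
        # Hypothetical caching rule: reuse an archive that was downloaded earlier.
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive) as tar:
        members = tar.getnames()
        # Only unpack archives from a source you trust.
        tar.extractall(dest)
    return members


# Example call (performs a real download, so it is left commented out):
# fetch_and_extract(
#     "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar",
#     "./inference",
# )
```

The archive is kept next to the extracted files so that a second call does not download it again; delete it if disk space matters.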
|
||||||
|
|
||||||
|
|