Merge pull request #1392 from tink2123/dygraph_doc
[Dygraph] update readme and polish some docs
This commit is contained in:
commit
71b8416493
|
@ -54,11 +54,10 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力
|
||||||
|
|
||||||
| 模型简介 | 模型名称 |推荐场景 | 检测模型 | 方向分类器 | 识别模型 |
|
| 模型简介 | 模型名称 |推荐场景 | 检测模型 | 方向分类器 | 识别模型 |
|
||||||
| ------------ | --------------- | ----------------|---- | ---------- | -------- |
|
| ------------ | --------------- | ----------------|---- | ---------- | -------- |
|
||||||
| 中英文超轻量OCR模型(8.1M) | ch_ppocr_mobile_v1.1_xx |移动端&服务器端|[推理模型](link) / [预训练模型](link)|[推理模型](link) / [预训练模型](link) |[推理模型](link) / [预训练模型](link) |
|
| 中英文超轻量OCR模型(8.1M) | ch_ppocr_mobile_v2.0_xx |移动端&服务器端|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
|
||||||
| 中英文通用OCR模型(155.1M) |ch_ppocr_server_v1.1_xx|服务器端 |[推理模型](link) / [预训练模型](link) |[推理模型](link) / [预训练模型](link) |[推理模型](link) / [预训练模型](link) |
|
| 中英文通用OCR模型(143M) |ch_ppocr_server_v2.0_xx|服务器端 |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
|
||||||
| 中英文超轻量压缩OCR模型(3.5M) | ch_ppocr_mobile_slim_v1.1_xx| 移动端 |[推理模型](link) / [slim模型](link) |[推理模型](link) / [slim模型](link)| [推理模型](link) / [slim模型](link)|
|
|
||||||
|
|
||||||
更多模型下载(包括多语言),可以参考[PP-OCR v1.1 系列模型下载](./doc/doc_ch/models_list.md)
|
更多模型下载(包括多语言),可以参考[PP-OCR v2.0 系列模型下载](./doc/doc_ch/models_list.md)
|
||||||
|
|
||||||
## 文档教程
|
## 文档教程
|
||||||
- [快速安装](./doc/doc_ch/installation.md)
|
- [快速安装](./doc/doc_ch/installation.md)
|
||||||
|
|
12
README_en.md
12
README_en.md
|
@ -62,15 +62,11 @@ Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Andr
|
||||||
|
|
||||||
| Model introduction | Model name | Recommended scene | Detection model | Direction classifier | Recognition model |
|
| Model introduction | Model name | Recommended scene | Detection model | Direction classifier | Recognition model |
|
||||||
| ------------------------------------------------------------ | ---------------------------- | ----------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
|
| ------------------------------------------------------------ | ---------------------------- | ----------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
|
||||||
| Chinese and English ultra-lightweight OCR model (8.1M) | ch_ppocr_mobile_v1.1_xx | Mobile & server | [inference model](link) / [pre-trained model](link) | [inference model](link) / [pre-trained model](link) | [inference model](link) / [pre-trained model](link) |
|
| Chinese and English ultra-lightweight OCR model (8.1M) | ch_ppocr_mobile_v2.0_xx | Mobile & server |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
|
||||||
| Chinese and English general OCR model (155.1M) | ch_ppocr_server_v1.1_xx | Server | [inference model](link) / [pre-trained model](link) | [inference model](link) / [pre-trained model](link) | [inference model](link) / [pre-trained model](link) |
|
| Chinese and English general OCR model (143M) | ch_ppocr_server_v2.0_xx | Server |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_traingit.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
|
||||||
| Chinese and English ultra-lightweight compressed OCR model (3.5M) | ch_ppocr_mobile_slim_v1.1_xx | Mobile | [inference model](link) / [slim model](link) | [inference model](link) / [slim model](link) | [inference model](link) / [slim model](link) |
|
|
||||||
| French ultra-lightweight OCR model (4.6M) | french_ppocr_mobile_v1.1_xx | Mobile & server | [inference model](link) / [pre-trained model](link) | - | [inference model](link) / [pre-trained model](link) |
|
|
||||||
| German ultra-lightweight OCR model (4.6M) | german_ppocr_mobile_v1.1_xx | Mobile & server | [inference model](link) / [pre-trained model](link) | - |[inference model](link) / [pre-trained model](link) |
|
|
||||||
| Korean ultra-lightweight OCR model (5.9M) | korean_ppocr_mobile_v1.1_xx | Mobile & server | [inference model](link) / [pre-trained model](link) | - |[inference model](link) / [pre-trained model](link)|
|
|
||||||
| Japan ultra-lightweight OCR model (6.2M) | japan_ppocr_mobile_v1.1_xx | Mobile & server | [inference model](link) / [pre-trained model](link) | - |[inference model](link) / [pre-trained model](link) |
|
|
||||||
|
|
||||||
For more model downloads (including multiple languages), please refer to [PP-OCR v1.1 series model downloads](./doc/doc_en/models_list_en.md).
|
|
||||||
|
For more model downloads (including multiple languages), please refer to [PP-OCR v2.0 series model downloads](./doc/doc_en/models_list_en.md).
|
||||||
|
|
||||||
For a new language request, please refer to [Guideline for new language_requests](#language_requests).
|
For a new language request, please refer to [Guideline for new language_requests](#language_requests).
|
||||||
|
|
||||||
|
|
|
@ -0,0 +1,97 @@
|
||||||
|
Global:
|
||||||
|
use_gpu: true
|
||||||
|
epoch_num: 72
|
||||||
|
log_smooth_window: 20
|
||||||
|
print_batch_step: 10
|
||||||
|
save_model_dir: ./output/rec/ic15/
|
||||||
|
save_epoch_step: 3
|
||||||
|
# evaluation is run every 2000 iterations
|
||||||
|
eval_batch_step: [0, 2000]
|
||||||
|
# if pretrained_model is saved in static mode, load_static_weights must set to True
|
||||||
|
cal_metric_during_train: True
|
||||||
|
pretrained_model:
|
||||||
|
checkpoints:
|
||||||
|
save_inference_dir:
|
||||||
|
use_visualdl: False
|
||||||
|
infer_img: doc/imgs_words_en/word_10.png
|
||||||
|
# for data or label process
|
||||||
|
character_dict_path: ppocr/utils/ic15_dict.txt
|
||||||
|
character_type: ch
|
||||||
|
max_text_length: 25
|
||||||
|
infer_mode: False
|
||||||
|
use_space_char: False
|
||||||
|
|
||||||
|
Optimizer:
|
||||||
|
name: Adam
|
||||||
|
beta1: 0.9
|
||||||
|
beta2: 0.999
|
||||||
|
lr:
|
||||||
|
learning_rate: 0.0005
|
||||||
|
regularizer:
|
||||||
|
name: 'L2'
|
||||||
|
factor: 0
|
||||||
|
|
||||||
|
Architecture:
|
||||||
|
model_type: rec
|
||||||
|
algorithm: CRNN
|
||||||
|
Transform:
|
||||||
|
Backbone:
|
||||||
|
name: ResNet
|
||||||
|
layers: 34
|
||||||
|
Neck:
|
||||||
|
name: SequenceEncoder
|
||||||
|
encoder_type: rnn
|
||||||
|
hidden_size: 256
|
||||||
|
Head:
|
||||||
|
name: CTCHead
|
||||||
|
fc_decay: 0
|
||||||
|
|
||||||
|
Loss:
|
||||||
|
name: CTCLoss
|
||||||
|
|
||||||
|
PostProcess:
|
||||||
|
name: CTCLabelDecode
|
||||||
|
|
||||||
|
Metric:
|
||||||
|
name: RecMetric
|
||||||
|
main_indicator: acc
|
||||||
|
|
||||||
|
Train:
|
||||||
|
dataset:
|
||||||
|
name: SimpleDataSet
|
||||||
|
data_dir: ./train_data/
|
||||||
|
label_file_list: ["./train_data/train_list.txt"]
|
||||||
|
transforms:
|
||||||
|
- DecodeImage: # load image
|
||||||
|
img_mode: BGR
|
||||||
|
channel_first: False
|
||||||
|
- CTCLabelEncode: # Class handling label
|
||||||
|
- RecResizeImg:
|
||||||
|
image_shape: [3, 32, 100]
|
||||||
|
- KeepKeys:
|
||||||
|
keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
|
||||||
|
loader:
|
||||||
|
shuffle: True
|
||||||
|
batch_size_per_card: 256
|
||||||
|
drop_last: True
|
||||||
|
num_workers: 8
|
||||||
|
|
||||||
|
Eval:
|
||||||
|
dataset:
|
||||||
|
name: SimpleDataSet
|
||||||
|
data_dir: ./train_data/
|
||||||
|
label_file_list: ["./train_data/train_list.txt"]
|
||||||
|
transforms:
|
||||||
|
- DecodeImage: # load image
|
||||||
|
img_mode: BGR
|
||||||
|
channel_first: False
|
||||||
|
- CTCLabelEncode: # Class handling label
|
||||||
|
- RecResizeImg:
|
||||||
|
image_shape: [3, 32, 100]
|
||||||
|
- KeepKeys:
|
||||||
|
keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
|
||||||
|
loader:
|
||||||
|
shuffle: False
|
||||||
|
drop_last: False
|
||||||
|
batch_size_per_card: 256
|
||||||
|
num_workers: 4
|
|
@ -37,8 +37,6 @@ ln -sf <path/to/dataset> <path/to/paddle_ocr>/train_data/dataset
|
||||||
|
|
||||||
若您本地没有数据集,可以在官网下载 [icdar2015](http://rrc.cvc.uab.es/?ch=4&com=downloads) 数据,用于快速验证。也可以参考[DTRB](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here),下载 benchmark 所需的lmdb格式数据集。
|
若您本地没有数据集,可以在官网下载 [icdar2015](http://rrc.cvc.uab.es/?ch=4&com=downloads) 数据,用于快速验证。也可以参考[DTRB](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here),下载 benchmark 所需的lmdb格式数据集。
|
||||||
|
|
||||||
如果希望复现SRN的论文指标,需要下载离线[增广数据](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA),提取码: y3ry。增广数据是由MJSynth和SynthText做旋转和扰动得到的。数据下载完成后请解压到 {your_path}/PaddleOCR/train_data/data_lmdb_release/training/ 路径下。
|
|
||||||
|
|
||||||
<a name="自定义数据集"></a>
|
<a name="自定义数据集"></a>
|
||||||
* 使用自己数据集
|
* 使用自己数据集
|
||||||
|
|
||||||
|
@ -65,7 +63,7 @@ wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_t
|
||||||
wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_test.txt
|
wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_test.txt
|
||||||
```
|
```
|
||||||
|
|
||||||
PaddleOCR 也提供了数据格式转换脚本,可以将官网 label 转换支持的数据格式。 数据转换工具在 `train_data/gen_label.py`, 这里以训练集为例:
|
PaddleOCR 也提供了数据格式转换脚本,可以将官网 label 转换支持的数据格式。 数据转换工具在 `ppocr/utils/gen_label.py`, 这里以训练集为例:
|
||||||
|
|
||||||
```
|
```
|
||||||
# 将官网下载的标签文件转换为 rec_gt_label.txt
|
# 将官网下载的标签文件转换为 rec_gt_label.txt
|
||||||
|
@ -116,9 +114,9 @@ n
|
||||||
|
|
||||||
word_dict.txt 每行有一个单字,将字符与数字索引映射在一起,“and” 将被映射成 [2 5 1]
|
word_dict.txt 每行有一个单字,将字符与数字索引映射在一起,“and” 将被映射成 [2 5 1]
|
||||||
|
|
||||||
`ppocr/utils/ppocr_keys_v1.txt` 是一个包含6623个字符的中文字典,
|
`ppocr/utils/ppocr_keys_v1.txt` 是一个包含6623个字符的中文字典
|
||||||
|
|
||||||
`ppocr/utils/ic15_dict.txt` 是一个包含36个字符的英文字典,
|
`ppocr/utils/ic15_dict.txt` 是一个包含36个字符的英文字典
|
||||||
|
|
||||||
`ppocr/utils/dict/french_dict.txt` 是一个包含118个字符的法文字典
|
`ppocr/utils/dict/french_dict.txt` 是一个包含118个字符的法文字典
|
||||||
|
|
||||||
|
@ -128,6 +126,8 @@ word_dict.txt 每行有一个单字,将字符与数字索引映射在一起,
|
||||||
|
|
||||||
`ppocr/utils/dict/german_dict.txt` 是一个包含131个字符的德文字典
|
`ppocr/utils/dict/german_dict.txt` 是一个包含131个字符的德文字典
|
||||||
|
|
||||||
|
`ppocr/utils/dict/en_dict.txt` 是一个包含63个字符的英文字典
|
||||||
|
|
||||||
|
|
||||||
您可以按需使用。
|
您可以按需使用。
|
||||||
|
|
||||||
|
@ -155,10 +155,10 @@ PaddleOCR提供了训练脚本、评估脚本和预测脚本,本节将以 CRNN
|
||||||
```
|
```
|
||||||
cd PaddleOCR/
|
cd PaddleOCR/
|
||||||
# 下载MobileNetV3的预训练模型
|
# 下载MobileNetV3的预训练模型
|
||||||
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar
|
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar
|
||||||
# 解压模型参数
|
# 解压模型参数
|
||||||
cd pretrain_models
|
cd pretrain_models
|
||||||
tar -xf rec_mv3_none_bilstm_ctc.tar && rm -rf rec_mv3_none_bilstm_ctc.tar
|
tar -xf rec_mv3_none_bilstm_ctc_v2.0_train.tar && rm -rf rec_mv3_none_bilstm_ctc_v2.0_train.tar
|
||||||
```
|
```
|
||||||
|
|
||||||
开始训练:
|
开始训练:
|
||||||
|
@ -204,9 +204,7 @@ PaddleOCR支持训练和评估交替进行, 可以在 `configs/rec/rec_icdar15_t
|
||||||
| rec_mv3_tps_bilstm_attn.yml | RARE | Mobilenet_v3 large 0.5 | tps | BiLSTM | attention |
|
| rec_mv3_tps_bilstm_attn.yml | RARE | Mobilenet_v3 large 0.5 | tps | BiLSTM | attention |
|
||||||
| rec_r34_vd_none_bilstm_ctc.yml | CRNN | Resnet34_vd | None | BiLSTM | ctc |
|
| rec_r34_vd_none_bilstm_ctc.yml | CRNN | Resnet34_vd | None | BiLSTM | ctc |
|
||||||
| rec_r34_vd_none_none_ctc.yml | Rosetta | Resnet34_vd | None | None | ctc |
|
| rec_r34_vd_none_none_ctc.yml | Rosetta | Resnet34_vd | None | None | ctc |
|
||||||
| rec_r34_vd_tps_bilstm_attn.yml | RARE | Resnet34_vd | tps | BiLSTM | attention |
|
|
||||||
| rec_r34_vd_tps_bilstm_ctc.yml | STARNet | Resnet34_vd | tps | BiLSTM | ctc |
|
| rec_r34_vd_tps_bilstm_ctc.yml | STARNet | Resnet34_vd | tps | BiLSTM | ctc |
|
||||||
| rec_r50fpn_vd_none_srn.yml | SRN | Resnet50_fpn_vd | None | rnn | srn |
|
|
||||||
|
|
||||||
训练中文数据,推荐使用[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml),如您希望尝试其他算法在中文数据集上的效果,请参考下列说明修改配置文件:
|
训练中文数据,推荐使用[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml),如您希望尝试其他算法在中文数据集上的效果,请参考下列说明修改配置文件:
|
||||||
|
|
||||||
|
|
|
@ -120,6 +120,9 @@ In `word_dict.txt`, there is a single word in each line, which maps characters a
|
||||||
|
|
||||||
`ppocr/utils/dict/german_dict.txt` is a German dictionary with 131 characters
|
`ppocr/utils/dict/german_dict.txt` is a German dictionary with 131 characters
|
||||||
|
|
||||||
|
`ppocr/utils/dict/en_dict.txt` is a English dictionary with 63 characters
|
||||||
|
|
||||||
|
|
||||||
You can use it on demand.
|
You can use it on demand.
|
||||||
|
|
||||||
The current multi-language model is still in the demo stage and will continue to optimize the model and add languages. **You are very welcome to provide us with dictionaries and fonts in other languages**,
|
The current multi-language model is still in the demo stage and will continue to optimize the model and add languages. **You are very welcome to provide us with dictionaries and fonts in other languages**,
|
||||||
|
@ -149,10 +152,10 @@ First download the pretrain model, you can download the trained model to finetun
|
||||||
```
|
```
|
||||||
cd PaddleOCR/
|
cd PaddleOCR/
|
||||||
# Download the pre-trained model of MobileNetV3
|
# Download the pre-trained model of MobileNetV3
|
||||||
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar
|
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar
|
||||||
# Decompress model parameters
|
# Decompress model parameters
|
||||||
cd pretrain_models
|
cd pretrain_models
|
||||||
tar -xf rec_mv3_none_bilstm_ctc.tar && rm -rf rec_mv3_none_bilstm_ctc.tar
|
tar -xf rec_mv3_none_bilstm_ctc_v2.0_train.tar && rm -rf rec_mv3_none_bilstm_ctc_v2.0_train.tar
|
||||||
```
|
```
|
||||||
|
|
||||||
Start training:
|
Start training:
|
||||||
|
@ -194,7 +197,6 @@ If the evaluation set is large, the test will be time-consuming. It is recommend
|
||||||
| rec_mv3_tps_bilstm_attn.yml | RARE | Mobilenet_v3 large 0.5 | tps | BiLSTM | attention |
|
| rec_mv3_tps_bilstm_attn.yml | RARE | Mobilenet_v3 large 0.5 | tps | BiLSTM | attention |
|
||||||
| rec_r34_vd_none_bilstm_ctc.yml | CRNN | Resnet34_vd | None | BiLSTM | ctc |
|
| rec_r34_vd_none_bilstm_ctc.yml | CRNN | Resnet34_vd | None | BiLSTM | ctc |
|
||||||
| rec_r34_vd_none_none_ctc.yml | Rosetta | Resnet34_vd | None | None | ctc |
|
| rec_r34_vd_none_none_ctc.yml | Rosetta | Resnet34_vd | None | None | ctc |
|
||||||
| rec_r34_vd_tps_bilstm_attn.yml | RARE | Resnet34_vd | tps | BiLSTM | attention |
|
|
||||||
| rec_r34_vd_tps_bilstm_ctc.yml | STARNet | Resnet34_vd | tps | BiLSTM | ctc |
|
| rec_r34_vd_tps_bilstm_ctc.yml | STARNet | Resnet34_vd | tps | BiLSTM | ctc |
|
||||||
|
|
||||||
For training Chinese data, it is recommended to use
|
For training Chinese data, it is recommended to use
|
||||||
|
|
BIN
doc/joinus.PNG
BIN
doc/joinus.PNG
Binary file not shown.
Before Width: | Height: | Size: 16 KiB After Width: | Height: | Size: 408 KiB |
|
@ -33,4 +33,4 @@ v
|
||||||
w
|
w
|
||||||
x
|
x
|
||||||
y
|
y
|
||||||
z
|
z
|
Loading…
Reference in New Issue