update rec_r31_sar.yml and sar docs

This commit is contained in:
andyjpaddle 2021-08-31 11:33:41 +00:00
parent 69f9bdd8f6
commit d43688a4da
3 changed files with 6 additions and 2 deletions

View File

@ -56,7 +56,7 @@ Train:
dataset:
name: SimpleDataSet
delimiter: ' '
label_file_list: ['/paddle/data/concat_data/icdar_2013_train20.txt', '/paddle/data/concat_data/icdar_2015_train20.txt', '/paddle/data/concat_data/coco_text_train20.txt', '/paddle/data/concat_data/IIIt5k_train20.txt', '/paddle/data/concat_data/SynthAdd_train.txt', '/paddle/data/concat_data/SynthText_train.txt', '/paddle/data/concat_data/Syn90k_train.txt']
label_file_list: ['/paddle/data/concat_data/train_list.txt']
data_dir: /paddle/data/concat_data/
ratio_list: 1.0
transforms:
@ -71,7 +71,7 @@ Train:
keep_keys: ['image', 'label', 'valid_ratio'] # dataloader will return list in this order
loader:
shuffle: True
batch_size_per_card: 64 # 32
batch_size_per_card: 64
drop_last: True
num_workers: 8
use_shared_memory: False

View File

@ -90,6 +90,8 @@ train_data/rec/train/word_002.jpg 用科技让复杂的世界更简单
如果希望复现SRN的论文指标需要下载离线[增广数据](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA),提取码: y3ry。增广数据是由MJSynth和SynthText做旋转和扰动得到的。数据下载完成后请解压到 {your_path}/PaddleOCR/train_data/data_lmdb_release/training/ 路径下。
如果希望复现SAR的论文指标需要下载[SynthAdd](https://pan.baidu.com/share/init?surl=uV0LtoNmcxbO-0YA7Ch4dg), 提取码627x。此外真实数据集icdar2013, icdar2015, cocotext, IIIT5也作为训练数据的一部分。具体数据细节可以参考论文SAR。
```
# 训练集标签
wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_train.txt

View File

@ -90,6 +90,8 @@ If you do not have a dataset locally, you can download it on the official websit
If you want to reproduce the paper indicators of SRN, you need to download offline [augmented data](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA), extraction code: y3ry. The augmented data is obtained by rotation and perturbation of mjsynth and synthtext. Please unzip the data to {your_path}/PaddleOCR/train_data/data_lmdb_Release/training/path.
If you want to reproduce the paper SAR, you need to download extra dataset [SynthAdd](https://pan.baidu.com/share/init?surl=uV0LtoNmcxbO-0YA7Ch4dg), extraction code: 627x. Besides, icdar2013, icdar2015, cocotext, IIIT5k datasets are also used to train. For specific details, please refer to the paper SAR.
PaddleOCR provides label files for training the icdar2015 dataset, which can be downloaded in the following ways:
```