Merge pull request #1308 from WenmuZhou/update_angle_class_doc
Update angle class doc
This commit is contained in:
commit
eade2ce87a
|
@ -45,7 +45,7 @@ train_data/cls/word_002.jpg 180
|
|||
```
|
||||
|-train_data
|
||||
|-cls
|
||||
|- 和一个cls_gt_test.txt
|
||||
|- cls_gt_test.txt
|
||||
|- test
|
||||
|- word_001.jpg
|
||||
|- word_002.jpg
|
||||
|
@ -62,29 +62,36 @@ PaddleOCR提供了训练脚本、评估脚本和预测脚本。
|
|||
*如果您安装的是cpu版本,请将配置文件中的 `use_gpu` 字段修改为false*
|
||||
|
||||
```
|
||||
# 设置PYTHONPATH路径
|
||||
export PYTHONPATH=$PYTHONPATH:.
|
||||
# GPU训练 支持单卡,多卡训练,通过CUDA_VISIBLE_DEVICES指定卡号
|
||||
export CUDA_VISIBLE_DEVICES=0,1,2,3
|
||||
# 启动训练
|
||||
python3 tools/train.py -c configs/cls/cls_mv3.yml
|
||||
# GPU训练 支持单卡,多卡训练,通过selected_gpus指定卡号
|
||||
# 启动训练,下面的命令已经写入train.sh文件中,只需修改文件里的配置文件路径即可
|
||||
python3 -m paddle.distributed.launch --selected_gpus '0,1,2,3,4,5,6,7' tools/train.py -c configs/cls/cls_mv3.yml
|
||||
```
|
||||
|
||||
- 数据增强
|
||||
|
||||
PaddleOCR提供了多种数据增强方式,如果您希望在训练时加入扰动,请在配置文件中设置 `distort: true`。
|
||||
PaddleOCR提供了多种数据增强方式,如果您希望在训练时加入扰动,请在配置文件中取消`Train.dataset.transforms`下的`RecAug`和`RandAugment`字段的注释。
|
||||
|
||||
默认的扰动方式有:颜色空间转换(cvtColor)、模糊(blur)、抖动(jitter)、噪声(Gasuss noise)、随机切割(random crop)、透视(perspective)、颜色反转(reverse),随机数据增强(RandAugment)。
|
||||
|
||||
训练过程中除随机数据增强外每种扰动方式以50%的概率被选择,具体代码实现请参考:
|
||||
[randaugment.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/cls/randaugment.py)
|
||||
[img_tools.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/rec/img_tools.py)
|
||||
[rec_img_aug.py](../../ppocr/data/imaug/rec_img_aug.py)
|
||||
[randaugment.py](../../ppocr/data/imaug/randaugment.py)
|
||||
|
||||
*由于OpenCV的兼容性问题,扰动操作暂时只支持linux*
|
||||
|
||||
### 训练
|
||||
|
||||
PaddleOCR支持训练和评估交替进行, 可以在 `configs/cls/cls_mv3.yml` 中修改 `eval_batch_step` 设置评估频率,默认每500个iter评估一次。评估过程中默认将最佳acc模型,保存为 `output/cls_mv3/best_accuracy` 。
|
||||
PaddleOCR支持训练和评估交替进行, 可以在 `configs/cls/cls_mv3.yml` 中修改 `eval_batch_step` 设置评估频率,默认每1000个iter评估一次。训练过程中将会保存如下内容:
|
||||
```bash
|
||||
├── best_accuracy.pdopt # 最佳模型的优化器参数
|
||||
├── best_accuracy.pdparams # 最佳模型的参数
|
||||
├── best_accuracy.states # 最佳模型的指标和epoch等信息
|
||||
├── config.yml # 本次实验的配置文件
|
||||
├── latest.pdopt # 最新模型的优化器参数
|
||||
├── latest.pdparams # 最新模型的参数
|
||||
├── latest.states # 最新模型的指标和epoch等信息
|
||||
└── train.log # 训练日志
|
||||
```
|
||||
|
||||
如果验证集很大,测试将会比较耗时,建议减少评估次数,或训练完再进行评估。
|
||||
|
||||
|
@ -92,9 +99,8 @@ PaddleOCR支持训练和评估交替进行, 可以在 `configs/cls/cls_mv3.yml`
|
|||
|
||||
### 评估
|
||||
|
||||
评估数据集可以通过`configs/cls/cls_reader.yml` 修改EvalReader中的 `label_file_path` 设置。
|
||||
评估数据集可以通过修改`configs/cls/cls_mv3.yml`文件里的`Eval.dataset.label_file_list` 字段设置。
|
||||
|
||||
*注意* 评估时必须确保配置文件中 infer_img 字段为空
|
||||
```
|
||||
export CUDA_VISIBLE_DEVICES=0
|
||||
# GPU 评估, Global.checkpoints 为待测权重
|
||||
|
@ -107,21 +113,20 @@ python3 tools/eval.py -c configs/cls/cls_mv3.yml -o Global.checkpoints={path/to/
|
|||
|
||||
使用 PaddleOCR 训练好的模型,可以通过以下脚本进行快速预测。
|
||||
|
||||
默认预测图片存储在 `infer_img` 里,通过 `-o Global.checkpoints` 指定权重:
|
||||
通过 `Global.infer_img` 指定预测图片或文件夹路径,通过 `Global.checkpoints` 指定权重:
|
||||
|
||||
```
|
||||
# 预测分类结果
|
||||
python3 tools/infer_cls.py -c configs/cls/cls_mv3.yml -o Global.checkpoints={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
|
||||
python3 tools/infer_cls.py -c configs/cls/cls_mv3.yml -o Global.checkpoints={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/ch/word_1.jpg
|
||||
```
|
||||
|
||||
预测图片:
|
||||
|
||||
![](../imgs_words/en/word_1.png)
|
||||
![](../imgs_words/ch/word_1.jpg)
|
||||
|
||||
得到输入图像的预测结果:
|
||||
|
||||
```
|
||||
infer_img: doc/imgs_words/en/word_1.png
|
||||
scores: [[0.93161047 0.06838956]]
|
||||
label: [0]
|
||||
infer_img: doc/imgs_words/ch/word_1.jpg
|
||||
result: ('0', 0.9998784)
|
||||
```
|
||||
|
|
|
@ -65,26 +65,35 @@ Start training:
|
|||
```
|
||||
# Set PYTHONPATH path
|
||||
export PYTHONPATH=$PYTHONPATH:.
|
||||
# GPU training Support single card and multi-card training, specify the card number through CUDA_VISIBLE_DEVICES
|
||||
export CUDA_VISIBLE_DEVICES=0,1,2,3
|
||||
# Training icdar15 English data
|
||||
python3 tools/train.py -c configs/cls/cls_mv3.yml
|
||||
# GPU training Support single card and multi-card training, specify the card number through selected_gpus
|
||||
# Start training, the following command has been written into the train.sh file, just modify the configuration file path in the file
|
||||
python3 -m paddle.distributed.launch --selected_gpus '0,1,2,3,4,5,6,7' tools/train.py -c configs/cls/cls_mv3.yml
|
||||
```
|
||||
|
||||
- Data Augmentation
|
||||
|
||||
PaddleOCR provides a variety of data augmentation methods. If you want to add disturbance during training, please set `distort: true` in the configuration file.
|
||||
PaddleOCR provides a variety of data augmentation methods. If you want to add disturbance during training, Please uncomment the `RecAug` and `RandAugment` fields under `Train.dataset.transforms` in the configuration file.
|
||||
|
||||
The default perturbation methods are: cvtColor, blur, jitter, Gasuss noise, random crop, perspective, color reverse, RandAugment.
|
||||
|
||||
Except for RandAugment, each disturbance method is selected with a 50% probability during the training process. For specific code implementation, please refer to:
|
||||
[randaugment.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/cls/randaugment.py)
|
||||
[img_tools.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/rec/img_tools.py)
|
||||
[rec_img_aug.py](../../ppocr/data/imaug/rec_img_aug.py)
|
||||
[randaugment.py](../../ppocr/data/imaug/randaugment.py)
|
||||
|
||||
|
||||
- Training
|
||||
|
||||
PaddleOCR supports alternating training and evaluation. You can modify `eval_batch_step` in `configs/cls/cls_mv3.yml` to set the evaluation frequency. By default, it is evaluated every 500 iter and the best acc model is saved under `output/cls_mv3/best_accuracy` during the evaluation process.
|
||||
PaddleOCR supports alternating training and evaluation. You can modify `eval_batch_step` in `configs/cls/cls_mv3.yml` to set the evaluation frequency. By default, it is evaluated every 1000 iter. The following content will be saved during training:
|
||||
```bash
|
||||
├── best_accuracy.pdopt # Optimizer parameters for the best model
|
||||
├── best_accuracy.pdparams # Parameters of the best model
|
||||
├── best_accuracy.states # Metric info and epochs of the best model
|
||||
├── config.yml # Configuration file for this experiment
|
||||
├── latest.pdopt # Optimizer parameters for the latest model
|
||||
├── latest.pdparams # Parameters of the latest model
|
||||
├── latest.states # Metric info and epochs of the latest model
|
||||
└── train.log # Training log
|
||||
```
|
||||
|
||||
If the evaluation set is large, the test will be time-consuming. It is recommended to reduce the number of evaluations, or evaluate after training.
|
||||
|
||||
|
@ -92,7 +101,7 @@ If the evaluation set is large, the test will be time-consuming. It is recommend
|
|||
|
||||
### EVALUATION
|
||||
|
||||
The evaluation data set can be modified via `configs/cls/cls_reader.yml` setting of `label_file_path` in EvalReader.
|
||||
The evaluation dataset can be set by modifying the `Eval.dataset.label_file_list` field in the `configs/cls/cls_mv3.yml` file.
|
||||
|
||||
```
|
||||
export CUDA_VISIBLE_DEVICES=0
|
||||
|
@ -106,21 +115,20 @@ python3 tools/eval.py -c configs/cls/cls_mv3.yml -o Global.checkpoints={path/to/
|
|||
|
||||
Using the model trained by paddleocr, you can quickly get prediction through the following script.
|
||||
|
||||
The default prediction picture is stored in `infer_img`, and the weight is specified via `-o Global.checkpoints`:
|
||||
Use `Global.infer_img` to specify the path of the predicted picture or folder, and use `Global.checkpoints` to specify the weight:
|
||||
|
||||
```
|
||||
# Predict English results
|
||||
python3 tools/infer_rec.py -c configs/cls/cls_mv3.yml -o Global.checkpoints={path/to/weights}/best_accuracy TestReader.infer_img=doc/imgs_words/en/word_1.jpg
|
||||
python3 tools/infer_cls.py -c configs/cls/cls_mv3.yml -o Global.checkpoints={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words_en/word_10.png
|
||||
```
|
||||
|
||||
Input image:
|
||||
|
||||
![](../imgs_words/en/word_1.png)
|
||||
![](../imgs_words_en/word_10.png)
|
||||
|
||||
Get the prediction result of the input image:
|
||||
|
||||
```
|
||||
infer_img: doc/imgs_words/en/word_1.png
|
||||
scores: [[0.93161047 0.06838956]]
|
||||
label: [0]
|
||||
infer_img: doc/imgs_words_en/word_10.png
|
||||
result: ('0', 0.9999995)
|
||||
```
|
||||
|
|
Loading…
Reference in New Issue