PaddleOCR/doc/doc_en/config_en.md

Optional parameter list

The following list can be viewed through --help

| FLAG | Supported script | Use | Defaults | Note |
| --- | --- | --- | --- | --- |
| -c | ALL | Specify the configuration file to use | None | Please refer to the parameter introduction for configuration file usage |
| -o | ALL | Set configuration options | None | Options set with -o take priority over the configuration file selected with -c, e.g. -o Global.use_gpu=false |
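
The priority rule for -o can be illustrated with a small sketch. This is not PaddleOCR's own option parser: `parse_value` and `apply_overrides` are hypothetical helpers, and the nested dictionary stands in for a configuration file that has already been loaded.

```python
import copy


def parse_value(text):
    """Convert a command-line string into bool, int, or float; otherwise keep the string."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def apply_overrides(config, overrides):
    """Return a copy of config with each 'Section.key=value' override applied on top."""
    merged = copy.deepcopy(config)
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = merged
        for name in parents:
            # Create intermediate sections that the file did not define.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw_value)
    return merged


# Stand-in for values loaded from the file passed with -c (assumed, for illustration).
file_config = {"Global": {"use_gpu": True, "epoch_num": 500}}
final_config = apply_overrides(file_config, ["Global.use_gpu=false"])
print(final_config)  # {'Global': {'use_gpu': False, 'epoch_num': 500}}
```

The override is applied after the file values, so the value given with -o wins while every other key keeps its value from the file.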

INTRODUCTION TO GLOBAL PARAMETERS OF CONFIGURATION FILE

Take rec_chinese_lite_train_v2.0.yml as an example

Global

| Parameter | Use | Defaults | Note |
| --- | --- | --- | --- |
| use_gpu | Whether to use the GPU | true | \ |
| epoch_num | Maximum number of training epochs | 500 | \ |
| log_smooth_window | Length of the log queue; the median of the values in the queue is printed each time | 20 | \ |
| print_batch_step | Log printing interval | 10 | \ |
| save_model_dir | Model save path | output/{algorithm_name} | \ |
| save_epoch_step | Model save interval | 3 | \ |
| eval_batch_step | Model evaluation interval | 2000 or [1000, 2000] | 2000: evaluation runs every 2000 iterations; [1000, 2000]: evaluation runs every 2000 iterations after the 1000th iteration |
| cal_metric_during_train | Whether to evaluate the metric during training; the metric is computed on the current batch | true | \ |
| load_static_weights | Whether the pre-trained model was saved in static graph mode (currently only required by the detection algorithm) | true | \ |
| pretrained_model | Path of the pre-trained model | ./pretrain_models/CRNN/best_accuracy | \ |
| checkpoints | Path of the model parameters | None | Used to load parameters and resume training after an interruption |
| use_visualdl | Whether to enable visualdl for visual log display | False | Tutorial |
| infer_img | Inference image path or folder path | ./infer_img | \ |
| character_dict_path | Dictionary path | ./ppocr/utils/ppocr_keys_v1.txt | \ |
| max_text_length | Maximum text length | 25 | \ |
| character_type | Character type | ch | en/ch; the default dict is used for en, and the custom dict is used for ch |
| use_space_char | Whether to recognize spaces | True | Only supported when character_type=ch |
| label_list | Angles supported by the direction classifier | ['0','180'] | Only valid for the angle classifier model |
| save_res_path | Path for saving the test results of the model | ./output/det_db/predicts_db.txt | Only valid for the text detection model |
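
The two accepted forms of eval_batch_step can be read as the following rule. This is a minimal sketch of the behaviour described in the table, not a copy of the training loop; `should_evaluate` and `global_step` are names chosen for illustration.

```python
def should_evaluate(global_step, eval_batch_step):
    """Decide whether evaluation runs at this training iteration."""
    if isinstance(eval_batch_step, (list, tuple)):
        # [start, interval]: wait for `start` iterations, then evaluate every `interval`.
        start_step, interval = eval_batch_step
    else:
        # A single integer: evaluate every `interval` iterations from the beginning.
        start_step, interval = 0, eval_batch_step
    if global_step <= start_step:
        return False
    return (global_step - start_step) % interval == 0


steps = range(1, 6001)
print([s for s in steps if should_evaluate(s, 2000)])          # [2000, 4000, 6000]
print([s for s in steps if should_evaluate(s, [1000, 2000])])  # [3000, 5000]
```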

Optimizer (ppocr/optimizer)

| Parameter | Use | Defaults | Note |
| --- | --- | --- | --- |
| name | Optimizer class name | Adam | Currently supports Momentum, Adam, RMSProp; see ppocr/optimizer/optimizer.py |
| beta1 | Exponential decay rate for the 1st moment estimates | 0.9 | \ |
| beta2 | Exponential decay rate for the 2nd moment estimates | 0.999 | \ |
| clip_norm | Maximum norm value | - | \ |
| lr | Learning rate decay method | - | \ |
| lr.name | Learning rate decay class name | Cosine | Currently supports Linear, Cosine, Step, Piecewise; see ppocr/optimizer/learning_rate.py |
| lr.learning_rate | Base learning rate | 0.001 | \ |
| regularizer | Network regularization method | - | \ |
| regularizer.name | Regularizer class name | L2 | Currently supports L1, L2; see ppocr/optimizer/regularizer.py |
| regularizer.factor | Regularization coefficient | 0.00004 | \ |
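
To make the roles of the base learning rate and the regularizer factor concrete, here is a minimal sketch. It assumes the textbook cosine annealing schedule that decays to zero and an L2 penalty with a 0.5 scaling; the classes in ppocr/optimizer may differ in details such as warmup, so treat both functions as illustrations rather than the library's implementation.

```python
import math


def cosine_lr(step, total_steps, base_lr=0.001):
    """Cosine annealing: start at base_lr and decay smoothly to 0 at total_steps."""
    progress = min(step, total_steps) / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def l2_penalty(weights, factor=0.00004):
    """L2 regularization term added to the loss (0.5 scaling is an assumed convention)."""
    return 0.5 * factor * sum(w * w for w in weights)


for step in (0, 250, 500, 750, 1000):
    print(step, cosine_lr(step, total_steps=1000))
```

The learning rate equals the base value at step 0, half of it at the midpoint, and approaches zero at the final step.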

Architecture (ppocr/modeling)

In ppocr, the network is divided into four stages: Transform, Backbone, Neck and Head

| Parameter | Use | Defaults | Note |
| --- | --- | --- | --- |
| model_type | Network type | rec | Currently supports rec, det, cls |
| algorithm | Model name | CRNN | See algorithm_overview for the support list |
| Transform | Transformation method | - | Currently only supported for recognition algorithms; see ppocr/modeling/transform for details |
| Transform.name | Transformation class name | TPS | Currently supports TPS |
| Transform.num_fiducial | Number of TPS control points | 20 | Ten each on the top and bottom |
| Transform.loc_lr | Localization network learning rate | 0.1 | \ |
| Transform.model_name | Localization network size | small | Currently supports small, large |
| Backbone | Network backbone | - | See ppocr/modeling/backbones |
| Backbone.name | Backbone class name | ResNet | Currently supports MobileNetV3, ResNet |
| Backbone.layers | Number of ResNet layers | 34 | Currently supports 18, 34, 50, 101, 152, 200 |
| Backbone.model_name | MobileNetV3 network size | small | Currently supports small, large |
| Neck | Network neck | - | See ppocr/modeling/necks |
| Neck.name | Neck class name | SequenceEncoder | Currently supports SequenceEncoder, DBFPN |
| Neck.encoder_type | SequenceEncoder encoder type | rnn | Currently supports reshape, fc, rnn |
| Neck.hidden_size | Number of RNN hidden units | 48 | \ |
| Neck.out_channels | Number of DBFPN output channels | 256 | \ |
| Head | Network head | - | See ppocr/modeling/heads |
| Head.name | Head class name | CTCHead | Currently supports CTCHead, DBHead, ClsHead |
| Head.fc_decay | CTCHead regularization coefficient | 0.0004 | \ |
| Head.k | DBHead binarization coefficient | 50 | \ |
| Head.class_dim | Number of ClsHead output categories | 2 | \ |
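
The nesting of the four stages is easier to see as a data structure. The sketch below mirrors a recognition configuration as a Python dictionary using values from the table; leaving Transform empty is an assumption made for this example, and `active_stages` is a hypothetical helper rather than part of ppocr.

```python
# Stages are applied in this order; a stage set to None is skipped.
STAGE_ORDER = ("Transform", "Backbone", "Neck", "Head")

architecture = {
    "model_type": "rec",
    "algorithm": "CRNN",
    "Transform": None,  # assumption: this example uses no transform stage
    "Backbone": {"name": "ResNet", "layers": 34},
    "Neck": {"name": "SequenceEncoder", "encoder_type": "rnn", "hidden_size": 48},
    "Head": {"name": "CTCHead", "fc_decay": 0.0004},
}


def active_stages(arch):
    """List (stage, class name) pairs for the stages that are configured."""
    return [(stage, arch[stage]["name"]) for stage in STAGE_ORDER if arch.get(stage)]


print(active_stages(architecture))
# [('Backbone', 'ResNet'), ('Neck', 'SequenceEncoder'), ('Head', 'CTCHead')]
```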

Loss (ppocr/losses)

| Parameter | Use | Defaults | Note |
| --- | --- | --- | --- |
| name | Loss class name | CTCLoss | Currently supports CTCLoss, DBLoss, ClsLoss |
| balance_loss | Whether to balance the number of positive and negative samples in DBLoss (using OHEM) | True | \ |
| ohem_ratio | Ratio of negative to positive samples for OHEM in DBLoss | 3 | \ |
| main_loss_type | Loss used for shrink_map in DBLoss | DiceLoss | Currently supports DiceLoss, BCELoss |
| alpha | Coefficient of shrink_map_loss in DBLoss | 5 | \ |
| beta | Coefficient of threshold_map_loss in DBLoss | 10 | \ |
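
The DBLoss parameters can be pictured with a small sketch. It assumes that OHEM keeps the hardest negatives up to ohem_ratio times the number of positives, and that the total loss is a weighted sum in which a third, unweighted binary-map term also appears; that third term and both function names are assumptions for illustration, not taken from the table.

```python
def select_hard_negatives(negative_losses, num_positive, ohem_ratio=3):
    """Keep only the highest-loss negatives, at most ohem_ratio per positive sample."""
    keep = min(len(negative_losses), int(num_positive * ohem_ratio))
    return sorted(negative_losses, reverse=True)[:keep]


def combine_db_loss(shrink_loss, threshold_loss, binary_loss, alpha=5, beta=10):
    """Weighted sum: alpha scales the shrink-map term, beta the threshold-map term."""
    return alpha * shrink_loss + beta * threshold_loss + binary_loss


per_pixel_negative_losses = [0.1, 0.9, 0.5, 0.3, 0.7, 0.2, 0.8]
print(select_hard_negatives(per_pixel_negative_losses, num_positive=2))
# [0.9, 0.8, 0.7, 0.5, 0.3, 0.2]
```

With two positive samples and ohem_ratio 3, six of the seven negatives are kept and the easiest one is dropped.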

PostProcess (ppocr/postprocess)

| Parameter | Use | Defaults | Note |
| --- | --- | --- | --- |
| name | Post-processing class name | CTCLabelDecode | Currently supports CTCLabelDecode, AttnLabelDecode, DBPostProcess, ClsPostProcess |
| thresh | Threshold for binarizing the segmentation map in DBPostProcess | 0.3 | \ |
| box_thresh | Threshold for filtering output boxes in DBPostProcess; boxes below this threshold are not output | 0.7 | \ |
| max_candidates | Maximum number of text boxes output in DBPostProcess | 1000 | \ |
| unclip_ratio | Unclip ratio of the text box in DBPostProcess | 2.0 | \ |
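
How the four DBPostProcess parameters interact can be sketched in plain Python. The functions below are hypothetical simplifications working on lists instead of images: contour extraction is omitted, and the unclip distance uses the area times ratio divided by perimeter rule from the DB method, which is assumed here rather than quoted from ppocr/postprocess.

```python
def binarize(prob_map, thresh=0.3):
    """Turn a probability map into a 0/1 mask using thresh."""
    return [[1 if p > thresh else 0 for p in row] for row in prob_map]


def filter_boxes(scored_boxes, box_thresh=0.7, max_candidates=1000):
    """Consider at most max_candidates boxes and keep those scoring at least box_thresh."""
    candidates = scored_boxes[:max_candidates]
    return [(box, score) for box, score in candidates if score >= box_thresh]


def unclip_distance(area, perimeter, unclip_ratio=2.0):
    """Offset by which a shrunk text polygon is expanded back outwards."""
    return area * unclip_ratio / perimeter


mask = binarize([[0.1, 0.5], [0.3, 0.9]])
boxes = filter_boxes([("box_a", 0.92), ("box_b", 0.55), ("box_c", 0.70)])
print(mask)                          # [[0, 1], [0, 1]]
print(boxes)                         # [('box_a', 0.92), ('box_c', 0.7)]
print(unclip_distance(100.0, 40.0))  # 5.0
```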

Metric (ppocr/metrics)

| Parameter | Use | Defaults | Note |
| --- | --- | --- | --- |
| name | Metric method name | RecMetric | Currently supports DetMetric, RecMetric, ClsMetric |
| main_indicator | Main indicator, used to select the best model | acc | hmean for the detection method; acc for the recognition and classification methods |
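
The role of main_indicator can be shown with a short sketch. It assumes hmean is the usual harmonic mean of precision and recall and that a higher indicator value is better; `update_best` is a hypothetical helper, not the function the training script uses.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def update_best(best, current, main_indicator="acc"):
    """Return whichever metric dict has the higher value for main_indicator."""
    if best is None or current[main_indicator] > best[main_indicator]:
        return current
    return best


best = None
for metrics in ({"acc": 0.71}, {"acc": 0.78}, {"acc": 0.75}):
    best = update_best(best, metrics, main_indicator="acc")
print(best)  # {'acc': 0.78}
```

For a detection model the same selection would be made with main_indicator set to hmean.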

Dataset (ppocr/data)

| Parameter | Use | Defaults | Note |
| --- | --- | --- | --- |
| dataset | Return one sample per iteration | - | - |
| dataset.name | Dataset class name | SimpleDataSet | Currently supports SimpleDataSet, LMDBDataSet |
| dataset.data_dir | Image folder path | ./train_data | \ |
| dataset.label_file_list | Ground-truth file path | ["./train_data/train_list.txt"] | Not required when dataset is LMDBDataSet |
| dataset.ratio_list | Sampling ratio for each label file | [1.0] | If label_file_list contains two train lists and ratio_list is [0.4,0.6], 40% is sampled from train_list1 and 60% from train_list2, and the samples are combined to form the dataset |
| dataset.transforms | List of methods used to transform images and labels | [DecodeImage,CTCLabelEncode,RecResizeImg,KeepKeys] | See ppocr/data/imaug |
| loader | Dataloader settings | - | \ |
| loader.shuffle | Whether to shuffle the dataset order every epoch | True | \ |
| loader.batch_size_per_card | Batch size per card during training | 256 | \ |
| loader.drop_last | Whether to discard the last incomplete mini-batch when the number of samples is not divisible by batch_size | True | \ |
| loader.num_workers | Number of sub-processes used to load data; if 0, no sub-process is started and data is loaded in the main process | 8 | \ |
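
The effect of ratio_list and drop_last can be sketched as follows. The example assumes each ratio is applied to its own label file, so [0.4, 0.6] keeps 40% of the first list and 60% of the second; both helper functions and the in-memory label lines are hypothetical, and the fixed seed is only there to make the sketch repeatable.

```python
import random


def sample_label_lines(label_files, ratio_list, seed=0):
    """Sample a fraction of the lines from each label file and merge the results."""
    rng = random.Random(seed)
    sampled = []
    for lines, ratio in zip(label_files, ratio_list):
        sampled.extend(rng.sample(lines, round(len(lines) * ratio)))
    return sampled


def batches_per_epoch(num_samples, batch_size, drop_last=True):
    """Count mini-batches; an incomplete final batch is dropped when drop_last is set."""
    full, remainder = divmod(num_samples, batch_size)
    return full if drop_last or remainder == 0 else full + 1


# Label lines as they might look after reading two label files (made up for illustration).
train_list1 = [f"img_a_{i}.jpg\ttext" for i in range(10)]
train_list2 = [f"img_b_{i}.jpg\ttext" for i in range(20)]

dataset = sample_label_lines([train_list1, train_list2], [0.4, 0.6])
print(len(dataset))                                   # 16
print(batches_per_epoch(1000, 256, drop_last=True))   # 3
print(batches_per_epoch(1000, 256, drop_last=False))  # 4
```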