> PaddleSlim 1.2.0 or a higher version should be installed before running this example.
Model Compression Tutorial (Quantization)
Compression results:
| ID | Task | Model | Compress Strategy | Criterion (Chinese dataset) | Inference Time (ms) | Inference Time (Total model) (ms) | Acceleration Ratio | Model Size (MB) | Compress Ratio | Download Link |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Detection | MobileNetV3_DB | None | 61.7 | 224 | 375 | - | 8.6 | - | |
|   | Recognition | MobileNetV3_CRNN | None | 62.0 | 9.52 | | | | | |
| 1 | Detection | SlimTextDet | PACT Quant Aware Training | 62.1 | 195 | 348 | 8% | 2.8 | 67.82% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
| 2 | Detection | SlimTextDet_quat_pruning | Pruning + PACT Quant Aware Training | 60.86 | 142 | 288 | 30% | 2.8 | 67.82% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
| 3 | Detection | SlimTextDet_pruning | Pruning | 61.57 | 138 | 295 | 27% | 2.9 | 66.28% | |
|   | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
Overview
Generally, a more complex model achieves better performance on a task, but it also introduces redundancy. Quantization reduces this redundancy by converting full-precision weights and activations to low-bit fixed-point numbers, which lowers the model's computational cost and improves inference performance.
This example uses the quantization APIs provided by PaddleSlim to compress the OCR model.
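To make the idea concrete, the sketch below shows the arithmetic behind simple 8-bit linear (abs-max) quantization. It is only an illustration of the general technique, not the scheme PaddleSlim uses internally; the scale computation and rounding here are assumptions for demonstration.

```python
import numpy as np

# Toy illustration of 8-bit linear (abs-max) quantization of a float tensor.
# This is a sketch of the general idea, not PaddleSlim's implementation.
weights = np.array([0.82, -0.41, 0.05, -0.93], dtype=np.float32)

scale = np.abs(weights).max() / 127.0                    # map the abs-max value to the int8 range
quantized = np.round(weights / scale).astype(np.int8)    # store as 8-bit integers
dequantized = quantized.astype(np.float32) * scale       # recover approximate float values

print(quantized)     # [ 112  -56    7 -127]
print(dequantized)   # close to the original weights, with a small rounding error
```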
It is recommended that you are familiar with PaddleOCR model training and the PaddleSlim documentation before reading this example.
Install PaddleSlim
```bash
git clone https://github.com/PaddlePaddle/PaddleSlim.git
cd PaddleSlim
python setup.py install
```
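If the installation is in doubt, a quick import check (a trivial sanity test, nothing PaddleOCR-specific) confirms that the package is visible to Python:

```python
# Verify that PaddleSlim was installed and is importable.
import paddleslim
print(paddleslim.__file__)  # prints the install location if the import succeeds
```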
Download Pretrained Model
Download link of the detection pretrained model
Download link of the recognition pretrained model
Quant-Aware Training
After loading the pretrained model, the model can be quantized once the quantization strategy is defined. For details of the quantization method, see: Model Quantization
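For illustration, a quant-aware training strategy in PaddleSlim is usually expressed as a configuration dictionary like the one below. The specific keys and values here are assumptions based on common PaddleSlim settings, not the exact configuration shipped with PaddleOCR; the strategy actually used by this example is defined in deploy/slim/quantization/quant.py.

```python
# Hedged sketch of a typical PaddleSlim quant-aware training configuration.
# The values below are illustrative assumptions, not PaddleOCR's exact settings.
quant_config = {
    'weight_quantize_type': 'abs_max',                            # quantization scheme for weights
    'activation_quantize_type': 'moving_average_abs_max',         # quantization scheme for activations
    'weight_bits': 8,                                             # weight bit width
    'activation_bits': 8,                                         # activation bit width
    'quantize_op_types': ['conv2d', 'depthwise_conv2d', 'mul'],   # operators to quantize
    'dtype': 'int8',                                              # target data type
}
# A dict like this is passed to paddleslim.quant.quant_aware(...) to rewrite the
# training program with fake-quantization ops before finetuning.
```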
Enter the PaddleOCR root directory and perform model quantization with the following command:
```bash
python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/quantization/pretrain_models/det_mv3_db/best_accuracy Global.save_model_dir=./output/quant_model
```
Export inference model
After quantization and finetuning, we can export the model as an inference model for predictive deployment:
```bash
python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_model_dir=./output/quant_inference_model
```
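After exporting, the inference model can be loaded with the Paddle Inference API for a quick smoke test. The snippet below is a minimal sketch; the model/params file names and the dummy input shape are assumptions and may need to be adjusted to match what export_model.py actually writes.

```python
import numpy as np
from paddle.inference import Config, create_predictor

# Minimal smoke test of the exported inference model.
# The "model"/"params" file names and the 640x640 input shape are assumptions.
config = Config("./output/quant_inference_model/model",
                "./output/quant_inference_model/params")
config.disable_gpu()                     # run the sanity check on CPU
predictor = create_predictor(config)

input_name = predictor.get_input_names()[0]
input_handle = predictor.get_input_handle(input_name)
dummy_image = np.random.rand(1, 3, 640, 640).astype("float32")  # placeholder input
input_handle.copy_from_cpu(dummy_image)

predictor.run()
output_name = predictor.get_output_names()[0]
output = predictor.get_output_handle(output_name).copy_to_cpu()
print("output shape:", output.shape)
```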