
> PaddleSlim 1.2.0 or a higher version should be installed before running this example.

# Model Compression Tutorial (Quantization)

## Compression Results

| ID | Task | Model | Compress Strategy | Criterion (Chinese dataset) | Inference Time (ms) | Inference Time (Total model) (ms) | Acceleration Ratio | Model Size (MB) | Compress Ratio | Download Link |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| 0 | Detection | MobileNetV3_DB | None | 61.7 | 224 | 375 | - | 8.6 | - | |
| | Recognition | MobileNetV3_CRNN | None | 62.0 | 9.52 | | | | | |
| 1 | Detection | SlimTextDet | PACT Quant Aware Training | 62.1 | 195 | 348 | 8% | 2.8 | 67.82% | |
| | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
| 2 | Detection | SlimTextDet_quat_pruning | Pruning + PACT Quant Aware Training | 60.86 | 142 | 288 | 30% | 2.8 | 67.82% | |
| | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |
| 3 | Detection | SlimTextDet_pruning | Pruning | 61.57 | 138 | 295 | 27% | 2.9 | 66.28% | |
| | Recognition | SlimTextRec | PACT Quant Aware Training | 61.48 | 8.6 | | | | | |

## Overview

Generally, a more complex model achieves better performance on a task, but it also introduces some redundancy. Quantization is a technique that reduces this redundancy by representing full-precision data with fixed-point numbers, so as to reduce model computation complexity and improve model inference performance.
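
As a toy illustration of the idea (plain NumPy, not PaddleSlim code), abs-max int8 quantization replaces a float32 tensor with 8-bit integers plus a single scale factor:

```python
import numpy as np

# Toy illustration: abs-max int8 quantization of a small float32 tensor.
w = np.array([0.82, -1.47, 0.05, 2.31], dtype=np.float32)
scale = np.abs(w).max() / 127.0                 # one scale for the whole tensor
w_int8 = np.round(w / scale).astype(np.int8)    # stored and computed in int8
w_dequant = w_int8.astype(np.float32) * scale   # approximate recovery at inference
print(w_int8)     # e.g. [ 45 -81   3 127]
print(w_dequant)  # close to the original values
```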

This example uses the quantization APIs provided by PaddleSlim to compress the OCR model.

It is recommended that you be familiar with PaddleOCR model training and the basics of PaddleSlim before reading this example.

## Install PaddleSlim

```bash
git clone https://github.com/PaddlePaddle/PaddleSlim.git
cd PaddleSlim
python setup.py install
```
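
As a quick sanity check (not part of the original instructions), you can verify that both Paddle and PaddleSlim import correctly and that PaddleSlim meets the version requirement:

```python
import paddle
import paddleslim

# Both imports should succeed; PaddleSlim must be 1.2.0 or newer for this example.
print(paddle.__version__)
print(getattr(paddleslim, "__version__", "unknown"))
```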

## Download Pretrained Models

- Download link of the detection pretrained model
- Download link of the recognition pretrained model

## Quantization-Aware Training

After loading the pretrained model, the model can be quantized once the quantization strategy is defined. For details of the quantization method, see Model Quantization.
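
As a rough sketch of what such a strategy looks like with the PaddleSlim 1.x static-graph API (the toy network and the config values below are illustrative assumptions, not the exact ones used by PaddleOCR's quant.py):

```python
import paddle.fluid as fluid
from paddleslim.quant import quant_aware, convert

place = fluid.CPUPlace()
main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    # Toy stand-in for the OCR backbone: one conv layer and one fc layer.
    image = fluid.data(name='image', shape=[None, 3, 32, 100], dtype='float32')
    feat = fluid.layers.conv2d(image, num_filters=8, filter_size=3, act='relu')
    logits = fluid.layers.fc(feat, size=10)

exe = fluid.Executor(place)
exe.run(startup_prog)

# Quantization strategy: 8-bit weights and activations with abs-max observers
# (illustrative values, not necessarily PaddleOCR's exact settings).
quant_config = {
    'weight_quantize_type': 'channel_wise_abs_max',
    'activation_quantize_type': 'moving_average_abs_max',
    'weight_bits': 8,
    'activation_bits': 8,
}
# Insert fake-quant/dequant ops for training and for evaluation.
quant_train_prog = quant_aware(main_prog, place, quant_config, for_test=False)
quant_eval_prog = quant_aware(main_prog.clone(for_test=True), place,
                              quant_config, for_test=True)
# ...run the normal training loop on quant_train_prog (loading the pretrained
# weights first), then fold the fake-quant ops into an inference-ready program:
inference_prog = convert(quant_eval_prog, place, quant_config)
```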

Enter the PaddleOCR root directory and perform model quantization with the following command:

```bash
python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/quantization/pretrain_models/det_mv3_db/best_accuracy Global.save_model_dir=./output/quant_model
```

## Export Inference Model

After quantization training and fine-tuning, the model can be exported as an inference model for inference deployment:

```bash
python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_model_dir=./output/quant_inference_model
```
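
The exported model can then be loaded with Paddle's static-graph inference API. A minimal sketch, assuming the Paddle 1.x `fluid.io` API; the model/params filenames below are assumptions and should match whatever export_model.py actually wrote:

```python
import paddle.fluid as fluid

place = fluid.CPUPlace()
exe = fluid.Executor(place)

# Load the exported quantized inference model; directory matches the export
# command above, filenames are assumptions.
inference_program, feed_names, fetch_targets = fluid.io.load_inference_model(
    dirname='./output/quant_inference_model',
    executor=exe,
    model_filename='model',
    params_filename='params')
# feed_names lists the input variable names; fetch_targets are the output tensors.
```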