
PaddleOCR provides two service deployment methods:

  • Based on PaddleHub Serving: Code path is "./deploy/hubserving". Please refer to the tutorial for usage.
  • Based on PaddleServing: Code path is "./deploy/pdserving". Please follow this tutorial.

Service deployment based on Paddle Serving

This tutorial describes the detailed steps for deploying the PaddleOCR online prediction service based on Paddle Serving.

Quick start service

1. Prepare the environment

Let's first install the relevant components of Paddle Serving. GPU is recommended for service deployment with Paddle Serving.

Requirements:

  • CUDA version: 9.0
  • cuDNN version: 7.0
  • Operating system version: >= CentOS 6
  • Python version: 2.7/3.6/3.7

Installation

# install GPU server
python -m pip install paddle_serving_server_gpu

# or, install CPU server
python -m pip install paddle_serving_server

# install client and App package (CPU/GPU)
python -m pip install paddle_serving_app paddle_serving_client
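
After installation, a quick import check can confirm that the packages are visible to the interpreter. The following is a minimal sketch, assuming Python 3 (it relies on importlib.util.find_spec) and assuming the import names match the pip package names above; missing_packages is a hypothetical helper, not part of Paddle Serving.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: import names equal the pip package names used above.
    required = ["paddle_serving_client", "paddle_serving_app"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Add paddle_serving_server or paddle_serving_server_gpu to the list, depending on which server package you installed.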

2. Model transformation

For convenience, you can directly use the converted models provided by paddle_serving_app. Execute the following commands to obtain them:

python -m paddle_serving_app.package --get_model ocr_rec
tar -xzvf ocr_rec.tar.gz
python -m paddle_serving_app.package --get_model ocr_det
tar -xzvf ocr_det.tar.gz 

The commands above download the db_crnn_mobile model, which is in a different format from the inference model. If you want to deploy other models, refer to the tutorial to convert your inference model into a model that Paddle Serving can deploy.

We take the ch_rec_r34_vd_crnn model as an example. Download the inference model by executing the following commands:

wget --no-check-certificate https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar
tar xf ch_rec_r34_vd_crnn_infer.tar

Convert the downloaded model by executing the following Python script:

from paddle_serving_client.io import inference_model_to_serving
# directory of the inference model to convert
inference_model_dir = "ch_rec_r34_vd_crnn"
# output directories for the generated client and server configurations
serving_client_dir = "serving_client_dir"
serving_server_dir = "serving_server_dir"
feed_var_names, fetch_var_names = inference_model_to_serving(
    inference_model_dir, serving_client_dir, serving_server_dir,
    model_filename="model", params_filename="params")

After the script finishes, the client and server model configurations are generated in serving_client_dir and serving_server_dir, respectively.
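
To see what the conversion produced, you can list the contents of the two output directories. This is a small sketch using only the standard library; list_generated_files is a hypothetical helper, and the exact file names written by the conversion are not assumed here.

```python
import os


def list_generated_files(directory):
    """Return sorted relative paths of all files under `directory`."""
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), directory))
    return sorted(found)


if __name__ == "__main__":
    for directory in ("serving_client_dir", "serving_server_dir"):
        if not os.path.isdir(directory):
            print(directory + " does not exist; the conversion may not have run")
            continue
        print(directory + " -> " + str(list_generated_files(directory)))
```

An empty list for either directory suggests the conversion did not complete.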

3. Start service

Start the standard version or the fast version service according to your actual needs. The comparison of the two versions is shown in the table below:

| version | characteristics | recommended scenarios |
| --- | --- | --- |
| standard version | High stability, suitable for distributed deployment | Large throughput and cross-regional deployment |
| fast version | Easy to deploy and fast to predict | Scenarios that require high prediction speed and fast iteration |

Mode 1. Start the standard mode service

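# start with CPU
# the detection service keeps running in the foreground, so start it in its own terminal (or in the background)
python -m paddle_serving_server.serve --model ocr_det_model --port 9293
python ocr_web_server.py cpu

# or, with GPU (same two steps)
python -m paddle_serving_server_gpu.serve --model ocr_det_model --port 9293 --gpu_id 0
python ocr_web_server.py gpu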
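
Before starting ocr_web_server.py, it can help to confirm that the detection service is accepting connections. The sketch below assumes the service listens on 127.0.0.1 at the port passed to --port above (9293); port_is_open is a hypothetical helper and only checks TCP reachability, not that the model loaded correctly.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()


if __name__ == "__main__":
    # 9293 is the port used in the start commands above.
    print("detection service reachable: " + str(port_is_open("127.0.0.1", 9293)))
```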

Mode 2. Start the fast mode service

# start with CPU
python ocr_local_server.py cpu

# or, with GPU
python ocr_local_server.py gpu

Send prediction requests

python ocr_web_client.py
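
For orientation, a request body for an image service is often a JSON document carrying the base64-encoded image. The following sketch only builds such a body and sends nothing. The "feed"/"fetch" layout and the "image" key are assumptions made for illustration (only the "res" key appears in the result shown in this tutorial), so check ocr_web_client.py for the format the client actually sends.

```python
import base64
import json


def build_request(image_bytes):
    """Build a JSON request body carrying a base64-encoded image.

    Assumption: the service accepts {"feed": [{"image": ...}], "fetch": ["res"]}.
    """
    encoded = base64.b64encode(image_bytes).decode("utf8")
    return json.dumps({"feed": [{"image": encoded}], "fetch": ["res"]})


if __name__ == "__main__":
    # Placeholder bytes; a real client would read an image file in binary mode.
    print(build_request(b"fake image bytes"))
```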

Returned result format

The returned result is a JSON string. Decoded and printed in Python 2 (hence the u prefixes), it looks like this:

{u'result': {u'res': [u'\u571f\u5730\u6574\u6cbb\u4e0e\u571f\u58e4\u4fee\u590d\u7814\u7a76\u4e2d\u5fc3', u'\u534e\u5357\u519c\u4e1a\u5927\u5b661\u7d20\u56fe']}}

You can also print each entry in res to get the readable text:

土地整治与土壤修复研究中心
华南农业大学1素图
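
A minimal sketch of that step, assuming Python 3 and assuming the response text is valid JSON with the structure shown above; readable_lines is a hypothetical helper.

```python
import json


def readable_lines(response_text):
    """Parse a JSON response and return the recognized text lines in `res`."""
    payload = json.loads(response_text)
    return payload["result"]["res"]


if __name__ == "__main__":
    # The example result above, written as JSON with its Unicode escapes kept.
    sample = (
        r'{"result": {"res": ['
        r'"\u571f\u5730\u6574\u6cbb\u4e0e\u571f\u58e4\u4fee\u590d\u7814\u7a76\u4e2d\u5fc3", '
        r'"\u534e\u5357\u519c\u4e1a\u5927\u5b661\u7d20\u56fe"]}}'
    )
    for line in readable_lines(sample):
        print(line)
```

The JSON parser turns the \u escapes into the corresponding characters, so each printed line is plain text.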

User-defined service module modification

The pre-processing and post-processing steps are implemented in the preprocess and postprocess functions in ocr_web_server.py or ocr_local_server.py. These functions call the pre-processing/post-processing library for common CV models provided by paddle_serving_app. You can modify the corresponding code according to your needs.
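
To illustrate the general shape of the two hooks, here is a standalone sketch. SketchOCRService is hypothetical: it does not inherit from any Paddle Serving class, runs no model, and its method signatures are simplified assumptions that will differ from those in ocr_web_server.py. It only shows where input decoding and output filtering would typically go.

```python
import base64


class SketchOCRService(object):
    """Hypothetical stand-in showing the shape of the two hooks."""

    def preprocess(self, feed):
        # Decode the base64 image string sent by the client into raw bytes.
        return {"image": base64.b64decode(feed["image"])}

    def postprocess(self, texts):
        # Drop empty recognition results and wrap the rest under "res".
        return {"res": [text for text in texts if text.strip()]}


if __name__ == "__main__":
    service = SketchOCRService()
    encoded = base64.b64encode(b"raw image bytes").decode("utf8")
    print(service.preprocess({"image": encoded}))
    print(service.postprocess(["first line", "", "second line"]))
```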

If you only want to start the detection service or the recognition service, execute the corresponding script from the following table. Specify cpu or gpu as the argument of the start command.

| task | standard | fast |
| --- | --- | --- |
| detection | det_web_server.py | det_local_server.py |
| recognition | rec_web_server.py | rec_local_server.py |
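
The script names in this tutorial can be captured in a small lookup, which may be handy in a launcher script. This is a sketch only: start_command and the "ocr" task label (for the combined ocr_web_server.py / ocr_local_server.py scripts) are hypothetical names introduced here, and the function builds the command without running it.

```python
# (task, mode) -> script, taken from the scripts named in this tutorial.
SCRIPTS = {
    ("ocr", "standard"): "ocr_web_server.py",
    ("ocr", "fast"): "ocr_local_server.py",
    ("detection", "standard"): "det_web_server.py",
    ("detection", "fast"): "det_local_server.py",
    ("recognition", "standard"): "rec_web_server.py",
    ("recognition", "fast"): "rec_local_server.py",
}


def start_command(task, mode, device):
    """Return the argument list that starts the requested service."""
    if device not in ("cpu", "gpu"):
        raise ValueError("device must be 'cpu' or 'gpu'")
    if (task, mode) not in SCRIPTS:
        raise ValueError("unknown task/mode: %s/%s" % (task, mode))
    return ["python", SCRIPTS[(task, mode)], device]


if __name__ == "__main__":
    print(" ".join(start_command("detection", "fast", "gpu")))
```

The returned list can be passed to subprocess.call if you want to launch the service from Python.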

For more information, see the Paddle Serving project.