Change uppercase to lowercase

Leif 2021-09-13 09:35:40 +08:00
parent d41e8b9c18
commit c0f6489af5
10 changed files with 108 additions and 71 deletions


@ -34,10 +34,10 @@ PPOCRLabel is a semi-automatic graphic annotation tool suitable for OCR field, w
pip3 install --upgrade pip
# If you have cuda9 or cuda10 installed on your machine, please run the following command to install
python3 -m pip install paddlepaddle-gpu==2.0.0 -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
# If you only have cpu on your machine, please run the following command to install
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
```
For other software version requirements, follow the instructions in the [Installation Document](https://www.paddlepaddle.org.cn/install/quick).


@ -37,11 +37,11 @@ PPOCRLabel是一款适用于OCR领域的半自动化图形标注工具内置P
pip3 install --upgrade pip
If your machine has CUDA 9 or CUDA 10 installed, run the following command to install
python3 -m pip install paddlepaddle-gpu==2.0.0 -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
If your machine only has a CPU, run the following command to install
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
```
For other version requirements, follow the instructions in the [installation document](https://www.paddlepaddle.org.cn/install/quick).


@ -1,4 +1,4 @@
# Server-side C++ inference
# Server-side C++ Inference
This chapter describes how to deploy PaddleOCR models with C++. For the corresponding Python inference deployment, see the [document](../../doc/doc_ch/inference.md).
C++ generally outperforms Python in computation speed, so C++ deployment is the usual choice in most CPU and GPU deployment scenarios.
@ -6,14 +6,14 @@ This section will introduce how to configure the C++ environment and complete it
PaddleOCR model deployment.
## 1. Prepare the environment
## 1. Prepare the Environment
### Environment
- Linux, docker is recommended.
### 1.1 Compile opencv
### 1.1 Compile OpenCV
* First, download the OpenCV source package for Linux from the official OpenCV website. Taking OpenCV 3.4.7 as an example, the download command is as follows.
@ -73,7 +73,7 @@ opencv3/
|-- share
```
### 1.2 Compile or download or the Paddle inference library
### 1.2 Compile or Download the Paddle Inference Library
* There are 2 ways to obtain the Paddle inference library, described in detail below.
@ -136,7 +136,7 @@ build/paddle_inference_install_dir/
Among them, `paddle` is the Paddle library required for C++ prediction later, and `version.txt` contains the version information of the current inference library.
## 2. Compile and run the demo
## 2. Compile and Run the Demo
### 2.1 Export the inference model


@ -2,16 +2,18 @@
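PaddleOCR breaks an algorithm down into the following parts and modularizes each of them, so that new algorithms can be assembled quickly.
* Data loading and processing
* Network
* Post-processing
* Loss function
* Metric evaluation
* Optimizer
* [1. Data loading and processing](#1)
* [2. Network](#2)
* [3. Post-processing](#3)
* [4. Loss function](#4)
* [5. Metric evaluation](#5)
* [6. Optimizer](#6)
Each part is introduced below, along with how to add the modules that a new algorithm needs to that part.
## Data loading and processing
<a name="1"></a>
## 1. Data loading and processing
Data loading and processing is made up of different modules, which read images, apply data augmentation, and build labels. This part lives under [ppocr/data](../../ppocr/data). The role of each file and folder is described below: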
@ -64,7 +66,9 @@ transforms:
keep_keys: [ 'image', 'label' ] # dataloader will return list in this order
```
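## Network
<a name="2"></a>
## 2. Network
The network part assembles the network. PaddleOCR divides the network into four parts, which live under [ppocr/modeling](../../ppocr/modeling). Data entering the network passes through these four parts in order (transforms->backbones->necks->heads).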
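The fixed ordering described above can be sketched as a plain function chain. This is only an illustration under stated assumptions, not PaddleOCR's actual model class: `build_pipeline` and the stand-in stage callables are hypothetical names introduced here.

```python
from typing import Callable, Optional, Sequence

Stage = Optional[Callable]


def build_pipeline(transform: Stage, backbone: Stage, neck: Stage, head: Stage) -> Callable:
    """Chain the four stages in the fixed order; a stage set to None is skipped."""
    stages: Sequence[Stage] = (transform, backbone, neck, head)

    def forward(x):
        for stage in stages:
            if stage is not None:
                x = stage(x)
        return x

    return forward


# Hypothetical stand-in stages that only record the order in which they ran.
model = build_pipeline(
    None,                        # no transform stage in this toy example
    lambda x: x + ["backbone"],
    lambda x: x + ["neck"],
    lambda x: x + ["head"],
)
trace = model([])
print(trace)
```

Keeping the order in one place means a new backbone or head only has to honour the input and output of its neighbours.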
@ -123,7 +127,9 @@ Architecture:
args1: args1
```
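## Post-processing
<a name="3"></a>
## 3. Post-processing
Post-processing decodes the network output to obtain text boxes or recognized text. This part lives under [ppocr/postprocess](../../ppocr/postprocess).
PaddleOCR has built-in post-processing modules for algorithms such as DB, EAST, SAST, CRNN, and Attention. Components that are not built in can be added with the following steps: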
@ -171,7 +177,9 @@ PostProcess:
args2: args2
```
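## Loss function
<a name="4"></a>
## 4. Loss function
The loss function computes the distance between the network output and the label. This part lives under [ppocr/losses](../../ppocr/losses).
PaddleOCR has built-in loss function modules for algorithms such as DB, EAST, SAST, CRNN, and Attention. Modules that are not built in can be added with the following steps: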
@ -208,7 +216,9 @@ Loss:
args2: args2
```
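## Metric evaluation
<a name="5"></a>
## 5. Metric evaluation
Metric evaluation computes the network's performance on the current batch. This part lives under [ppocr/metrics](../../ppocr/metrics). PaddleOCR has built-in metric modules for detection, classification, and recognition algorithms. Modules that are not built in can be added with the following steps: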
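To make the idea of a per-batch metric concrete, here is a minimal sketch of an exact-match accuracy metric. The class name `AccuracyMetric` and its method names are assumptions chosen for illustration; they are not taken from the PaddleOCR source and the real interface may differ.

```python
class AccuracyMetric:
    """Hypothetical metric: exact-match accuracy, per batch and accumulated."""

    def __init__(self):
        self.reset()

    def __call__(self, preds, labels):
        # Score the current batch and add it to the running totals.
        correct = sum(p == l for p, l in zip(preds, labels))
        self.correct += correct
        self.total += len(labels)
        return {"acc": correct / max(len(labels), 1)}

    def get_metric(self):
        # Report accuracy over everything seen so far, then start fresh.
        acc = self.correct / max(self.total, 1)
        self.reset()
        return {"acc": acc}

    def reset(self):
        self.correct = 0
        self.total = 0


metric = AccuracyMetric()
first_batch = metric(["a", "b"], ["a", "c"])   # one of two correct
metric(["x"], ["x"])                           # one of one correct
overall = metric.get_metric()
```

Returning a dictionary keyed by `acc` lines up with a config entry such as `main_indicator: acc`, which selects the value used to compare checkpoints.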
@ -262,7 +272,9 @@ Metric:
main_indicator: acc
```
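## Optimizer
<a name="6"></a>
## 6. Optimizer
The optimizer is used to train the network. It also contains the network regularization and learning rate decay modules. This part lives under [ppocr/optimizer](../../ppocr/optimizer). PaddleOCR has built-in common optimizer modules such as `Momentum`, `Adam`,
and `RMSProp`, common learning rate decay modules such as `Linear`, `Cosine`, `Step`, and `Piecewise`, and common regularization modules such as `L1Decay` and `L2Decay`.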


@ -90,10 +90,10 @@ cd /path/to/ppocr_img
```
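To use the 2.0 model, specify the parameter `--version 2.0`; paddleocr uses the 2.1 model by default. For more on using the whl package, see the [whl package documentation](./whl.md)
To use the 2.0 model, specify the parameter `--version PP-OCR`; paddleocr uses the 2.1 model by default (`--version PP-OCRv2`). For more on using the whl package, see the [whl package documentation](./whl.md)
<a name="212"></a>
#### 2.1.2 Multi-language model
PaddleOCR currently supports 80 languages, which can be switched by changing the `--lang` parameter. For the English model, specify `--lang=en`.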


@ -1,13 +1,14 @@
# TEXT ANGLE CLASSIFICATION
# Text Angle Classification
- [Method Introduction](#method-introduction)
- [Data Preparation](#data-preparation)
- [Training](#training)
- [Evaluation](#evaluation)
- [Prediction](#prediction)
- [1. Method Introduction](#method-introduction)
- [2. Data Preparation](#data-preparation)
- [3. Training](#training)
- [4. Evaluation](#evaluation)
- [5. Prediction](#prediction)
<a name="method-introduction"></a>
## Method Introduction
## 1. Method Introduction
Angle classification is used when the text in an image is not at 0 degrees, in which case the text lines detected in the picture need to be corrected. In the PaddleOCR system,
the text line image obtained after text detection is sent to the recognition model after an affine transformation. At that point only a 0 or 180 degree classification of the text is required, so the built-in PaddleOCR text angle classifier **only supports 0 and 180 degree classification**. To support more angles, you can modify the algorithm yourself.
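Because only 0 and 180 degrees are distinguished, the correction step reduces to flipping the text line image when the classifier reports 180 with enough confidence. The sketch below illustrates that idea on a nested list standing in for an image; `correct_orientation` and the `thresh` default are assumptions made for illustration, not the PaddleOCR implementation.

```python
def correct_orientation(image, label, score, thresh=0.9):
    """Rotate a row-major image by 180 degrees when the classifier is confident.

    `image` is a list of rows; `label` is "0" or "180"; `score` is the
    classifier confidence. `thresh` is a hypothetical cut-off.
    """
    if label == "180" and score >= thresh:
        # A 180 degree rotation reverses the row order and each row's pixels.
        return [row[::-1] for row in image[::-1]]
    return image


line = [[1, 2, 3],
        [4, 5, 6]]
fixed = correct_orientation(line, "180", 0.98)  # confident: rotated
kept = correct_orientation(line, "180", 0.40)   # not confident: unchanged
```

Gating on the score avoids flipping lines that the classifier is unsure about, at the cost of leaving some upside-down lines uncorrected.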
@ -16,7 +17,7 @@ Example of 0 and 180 degree data samples
![](../imgs_results/angle_class_example.jpg)
<a name="data-preparation"></a>
## Data Preparation
## 2. Data Preparation
Please organize the dataset as follows:
@ -72,7 +73,7 @@ containing all images (test) and a cls_gt_test.txt. The structure of the test se
| ...
```
<a name="training"></a>
## Training
## 3. Training
Write the prepared txt file and image folder path into the configuration file under the `Train/Eval.dataset.label_file_list` and `Train/Eval.dataset.data_dir` fields. The absolute path of an image is formed from the `Train/Eval.dataset.data_dir` field and the image name recorded in the txt file.
PaddleOCR provides training scripts, evaluation scripts, and prediction scripts.
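The path rule described above (the `data_dir` field joined with the image name from the txt file) can be sketched as follows. The sketch assumes each label line holds an image name and a label separated by a tab; `resolve_samples` and the sample paths are hypothetical.

```python
import os


def resolve_samples(data_dir, label_lines):
    """Join data_dir with each image name; assumes 'name<TAB>label' lines."""
    samples = []
    for line in label_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, label = line.split("\t", 1)
        samples.append((os.path.join(data_dir, name), label))
    return samples


# Illustrative label lines, not taken from a real dataset.
lines = ["train/word_001.jpg\t0\n", "\n", "train/word_002.jpg\t180\n"]
samples = resolve_samples("/data/cls", lines)
```

If an image cannot be found during training, comparing the joined path with the file on disk is usually the quickest check.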
@ -117,7 +118,7 @@ If the evaluation set is large, the test will be time-consuming. It is recommend
**Note that the configuration file for prediction/evaluation must be consistent with the training.**
<a name="evaluation"></a>
## Evaluation
## 4. Evaluation
The evaluation dataset can be set by modifying the `Eval.dataset.label_file_list` field in the `configs/cls/cls_mv3.yml` file.
@ -127,7 +128,7 @@ export CUDA_VISIBLE_DEVICES=0
python3 tools/eval.py -c configs/cls/cls_mv3.yml -o Global.checkpoints={path/to/weights}/best_accuracy
```
<a name="prediction"></a>
## Prediction
## 5. Prediction
* Training engine prediction


@ -1,4 +1,12 @@
## Optional parameter list
# Configuration
- [1. Optional Parameter List](#1-optional-parameter-list)
- [2. Introduction to Global Parameters of Configuration File](#2-intorduction-to-global-parameters-of-configuration-file)
- [3. Multilingual Config File Generation](#3-multilingual-config-file-generation)
<a name="1-optional-parameter-list"></a>
## 1. Optional Parameter List
The following list can be viewed through `--help`
@ -7,7 +15,9 @@ The following list can be viewed through `--help`
| -c | ALL | Specify configuration file to use | None | **Please refer to the parameter introduction for configuration file usage** |
| -o | ALL | set configuration options | None | Configuration using -o has higher priority than the configuration file selected with -c. E.g: -o Global.use_gpu=false |
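The priority rule for `-o` can be illustrated with a small merge function: dotted keys such as `Global.use_gpu=false` are applied on top of the values loaded from the file chosen with `-c`. This is a sketch under stated assumptions, not PaddleOCR's own parser; `apply_overrides`, `parse_value`, and the sample values are hypothetical.

```python
import copy


def parse_value(raw):
    """Interpret a raw override string as bool, int, float, or plain string."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def apply_overrides(config, overrides):
    """Return a copy of config with 'A.b.c=value' overrides applied on top."""
    merged = copy.deepcopy(config)
    for item in overrides:
        dotted_key, raw = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = merged
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = parse_value(raw)
    return merged


# Stand-in for values read from the file given with -c (illustrative only).
file_config = {"Global": {"use_gpu": True, "epoch_num": 500}}
effective = apply_overrides(file_config, ["Global.use_gpu=false"])
```

Only the overridden key changes; every other value from the configuration file is kept.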
## INTRODUCTION TO GLOBAL PARAMETERS OF CONFIGURATION FILE
<a name="2-intorduction-to-global-parameters-of-configuration-file"></a>
## 2. Introduction to Global Parameters of Configuration File
Take rec_chinese_lite_train_v2.0.yml as an example
### Global
@ -121,8 +131,9 @@ In PaddleOCR, the network is divided into four stages: Transform, Backbone, Neck
| drop_last | Whether to discard the last incomplete mini-batch because the number of samples in the data set cannot be divisible by batch_size | True | \ |
| num_workers | The number of sub-processes used to load data, if it is 0, the sub-process is not started, and the data is loaded in the main process | 8 | \ |
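The effect of `drop_last` on the number of mini-batches per epoch is simple arithmetic, sketched below. The function name and the sample sizes are illustrative assumptions.

```python
import math


def batches_per_epoch(num_samples, batch_size, drop_last):
    """Count mini-batches in one epoch, with or without the incomplete tail."""
    if drop_last:
        # The incomplete final batch is discarded.
        return num_samples // batch_size
    # The incomplete final batch is kept as a smaller batch.
    return math.ceil(num_samples / batch_size)


with_drop = batches_per_epoch(1000, 256, drop_last=True)
without_drop = batches_per_epoch(1000, 256, drop_last=False)
```

With 1000 samples and a batch size of 256, three full batches cover 768 samples, so dropping the tail skips the remaining 232 samples in that epoch.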
<a name="3-multilingual-config-file-generation"></a>
## 3. MULTILINGUAL CONFIG FILE GENERATION
## 3. Multilingual Config File Generation
PaddleOCR currently supports recognition of 80 languages (in addition to Chinese). A multi-language configuration file template is
provided under the path `configs/rec/multi_languages`: [rec_multi_language_lite_train.yml](../../configs/rec/multi_language/rec_multi_language_lite_train.yml).


@ -1,5 +1,5 @@
# Reasoning based on Python prediction engine
# Inference based on Python Prediction Engine
The inference model (the model saved by `paddle.jit.save`) is generally a solidified model saved after the model training is completed, and is mostly used to give prediction in deployment.
@ -10,21 +10,21 @@ For more details, please refer to the document [Classification Framework](https:
Next, we first introduce how to convert a trained model into an inference model, and then we will introduce text detection, text recognition, angle class, and the concatenation of them based on inference model.
- [CONVERT TRAINING MODEL TO INFERENCE MODEL](#CONVERT)
- [Convert detection model to inference model](#Convert_detection_model)
- [Convert recognition model to inference model](#Convert_recognition_model)
- [Convert angle classification model to inference model](#Convert_angle_class_model)
- [1. Convert Training Model to Inference Model](#CONVERT)
- [1.1 Convert Detection Model to Inference Model](#Convert_detection_model)
- [1.2 Convert Recognition Model to Inference Model](#Convert_recognition_model)
- [1.3 Convert Angle Classification Model to Inference Model](#Convert_angle_class_model)
- [TEXT DETECTION MODEL INFERENCE](#DETECTION_MODEL_INFERENCE)
- [1. LIGHTWEIGHT CHINESE DETECTION MODEL INFERENCE](#LIGHTWEIGHT_DETECTION)
- [2. DB TEXT DETECTION MODEL INFERENCE](#DB_DETECTION)
- [3. EAST TEXT DETECTION MODEL INFERENCE](#EAST_DETECTION)
- [4. SAST TEXT DETECTION MODEL INFERENCE](#SAST_DETECTION)
- [2. Text Detection Model Inference](#DETECTION_MODEL_INFERENCE)
- [2.1 Lightweight Chinese Detection Model Inference](#LIGHTWEIGHT_DETECTION)
- [2.2 DB Text Detection Model Inference](#DB_DETECTION)
- [2.3 EAST Text Detection Model Inference](#EAST_DETECTION)
- [2.4 SAST Text Detection Model Inference](#SAST_DETECTION)
- [5. Multilingual model inference](#Multilingual model inference)
- [TEXT RECOGNITION MODEL INFERENCE](#RECOGNITION_MODEL_INFERENCE)
- [1. LIGHTWEIGHT CHINESE MODEL](#LIGHTWEIGHT_RECOGNITION)
- [3. Text Recognition Model Inference](#RECOGNITION_MODEL_INFERENCE)
- [3.1 Lightweight Chinese Text Recognition Model Inference](#LIGHTWEIGHT_RECOGNITION)
- [2. CTC-BASED TEXT RECOGNITION MODEL INFERENCE](#CTC-BASED_RECOGNITION)
- [3. SRN-BASED TEXT RECOGNITION MODEL INFERENCE](#SRN-BASED_RECOGNITION)
- [3. TEXT RECOGNITION MODEL INFERENCE USING CUSTOM CHARACTERS DICTIONARY](#USING_CUSTOM_CHARACTERS)
@ -38,9 +38,9 @@ Next, we first introduce how to convert a trained model into an inference model,
- [2. OTHER MODELS](#OTHER_MODELS)
<a name="CONVERT"></a>
## CONVERT TRAINING MODEL TO INFERENCE MODEL
## 1. Convert Training Model to Inference Model
<a name="Convert_detection_model"></a>
### Convert detection model to inference model
### 1.1 Convert Detection Model to Inference Model
Download the lightweight Chinese detection model:
```
@ -67,7 +67,7 @@ inference/det_db/
```
<a name="Convert_recognition_model"></a>
### Convert recognition model to inference model
### 1.2 Convert Recognition Model to Inference Model
Download the lightweight Chinese recognition model:
```
@ -95,7 +95,7 @@ inference/det_db/
```
<a name="Convert_angle_class_model"></a>
### Convert angle classification model to inference model
### 1.3 Convert Angle Classification Model to Inference Model
Download the angle classification model:
```
@ -122,13 +122,13 @@ inference/det_db/
<a name="DETECTION_MODEL_INFERENCE"></a>
## TEXT DETECTION MODEL INFERENCE
## 2. Text Detection Model Inference
The following will introduce the lightweight Chinese detection model inference, DB text detection model inference and EAST text detection model inference. The default configuration is based on the inference setting of the DB text detection model.
Because the EAST and DB algorithms differ significantly, inference requires **adapting to the EAST text detection algorithm by passing in the corresponding parameters**.
<a name="LIGHTWEIGHT_DETECTION"></a>
### 1. LIGHTWEIGHT CHINESE DETECTION MODEL INFERENCE
### 2.1 Lightweight Chinese Detection Model Inference
For lightweight Chinese detection model inference, you can execute the following commands:
@ -163,7 +163,7 @@ python3 tools/infer/predict_det.py --image_dir="./doc/imgs/1.jpg" --det_model_di
```
<a name="DB_DETECTION"></a>
### 2. DB TEXT DETECTION MODEL INFERENCE
### 2.2 DB Text Detection Model Inference
First, convert the model saved in the DB text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar)), you can use the following command to convert:
@ -184,7 +184,7 @@ The visualized text detection results are saved to the `./inference_results` fol
**Note**: Since the ICDAR2015 dataset has only 1,000 training images, mainly for English scenes, the above model has very poor detection result on Chinese text images.
<a name="EAST_DETECTION"></a>
### 3. EAST TEXT DETECTION MODEL INFERENCE
### 2.3 EAST Text Detection Model Inference
First, convert the model saved in the EAST text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)), you can use the following command to convert:
@ -205,7 +205,7 @@ The visualized text detection results are saved to the `./inference_results` fol
<a name="SAST_DETECTION"></a>
### 4. SAST TEXT DETECTION MODEL INFERENCE
### 2.4 SAST Text Detection Model Inference
#### (1). Quadrangle text detection model (ICDAR2015)
First, convert the model saved in the SAST text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)), you can use the following command to convert:
@ -243,13 +243,13 @@ The visualized text detection results are saved to the `./inference_results` fol
**Note**: SAST post-processing uses locality-aware NMS, which has a Python version and a C++ version. The C++ version is considerably faster than the Python version. Because of a compilation-version issue with the C++ NMS, the C++ version is called only in a Python 3.5 environment, and the Python version is called in all other cases.
<a name="RECOGNITION_MODEL_INFERENCE"></a>
## TEXT RECOGNITION MODEL INFERENCE
## 3. Text Recognition Model Inference
The following will introduce the lightweight Chinese recognition model inference, other CTC-based and Attention-based text recognition models inference. For Chinese text recognition, it is recommended to choose the recognition model based on CTC loss. In practice, it is also found that the result of the model based on Attention loss is not as good as the one based on CTC loss. In addition, if the characters dictionary is modified during training, make sure that you use the same characters set during inferencing. Please check below for details.
<a name="LIGHTWEIGHT_RECOGNITION"></a>
### 1. LIGHTWEIGHT CHINESE TEXT RECOGNITION MODEL REFERENCE
### 3.1 Lightweight Chinese Text Recognition Model Inference
For lightweight Chinese recognition model inference, you can execute the following commands:


@ -5,7 +5,7 @@
+ [1. Install PaddleOCR Whl Package](#1-install-paddleocr-whl-package)
* [2. Easy-to-Use](#2-easy-to-use)
+ [2.1 Use by command line](#21-use-by-command-line)
+ [2.1 Use by Command Line](#21-use-by-command-line)
- [2.1.1 English and Chinese Model](#211-english-and-chinese-model)
- [2.1.2 Multi-language Model](#212-multi-language-model)
- [2.1.3 Layout Analysis](#213-layoutAnalysis)
@ -39,7 +39,7 @@ pip install "paddleocr>=2.0.1" # Recommend to use version 2.0.1+
<a name="21-use-by-command-line"></a>
### 2.1 Use by command line
### 2.1 Use by Command Line
PaddleOCR provides a series of test images, click [here](https://paddleocr.bj.bcebos.com/dygraph_v2.1/ppocr_img.zip) to download, and then switch to the corresponding directory in the terminal
@ -95,7 +95,7 @@ If you do not use the provided test image, you can replace the following `--imag
['PAIN', 0.990372]
```
If you need to use the 2.0 model, please specify the parameter `--version 2.0`, paddleocr uses the 2.1 model by default. More whl package usage can be found in [whl package](./whl_en.md)
If you need to use the 2.0 model, specify the parameter `--version PP-OCR`; paddleocr uses the 2.1 model by default (`--version PP-OCRv2`). More whl package usage can be found in [whl package](./whl_en.md)
<a name="212-multi-language-model"></a>
#### 2.1.2 Multi-language Model


@ -1,6 +1,16 @@
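# Table recognition
* [1. Table recognition pipeline](#1)
* [2. Performance](#2)
* [3. Usage](#3)
+ [3.1 Quick start](#31)
+ [3.2 Training](#32)
+ [3.3 Evaluation](#33)
+ [3.4 Prediction](#34)
<a name="1"></a>
## 1. Table recognition pipeline
Table recognition mainly involves three models
1. Single-line text detection-DB
2. Single-line text recognition-CRNN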
@ -17,6 +27,8 @@
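3. The coordinates and recognition results of the single-line text are combined with the cell coordinates to produce the recognition result of each cell.
4. The cell recognition results and the table structure are used together to build the HTML string of the table.
<a name="2"></a>
## 2. Performance
We evaluated the algorithm on the PubTabNet<sup>[1]</sup> evaluation dataset, with the following results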
@ -26,8 +38,9 @@
| EDD<sup>[2]</sup> | 88.3 |
| Ours | 93.32 |
<a name="3"></a>
## 3. Usage
<a name="31"></a>
### 3.1 Quick start
```python
@ -48,7 +61,7 @@ python3 table/predict_table.py --det_model_dir=inference/en_ppocr_mobile_v2.0_ta
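After the run completes, the excel table for each image is saved to the directory specified by the output field
note: The models above are table recognition models trained on the PubLayNet dataset and only support English scanned documents. To recognize other scenarios, train your own models and replace the three fields `det_model_dir`, `rec_model_dir`, and `table_model_dir`.
<a name="32"></a>
### 3.2 Training
This section only covers training the table structure model. For training the [text detection](../../doc/doc_ch/detection.md) and [text recognition](../../doc/doc_ch/recognition.md) models, see the corresponding documents.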
@ -75,7 +88,7 @@ python3 tools/train.py -c configs/table/table_mv3.yml -o Global.checkpoints=./yo
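**Note**: `Global.checkpoints` takes priority over `Global.pretrain_weights`. When both parameters are specified, the model given by `Global.checkpoints` is loaded first; if the model path given by `Global.checkpoints` is wrong, the model given by `Global.pretrain_weights` is loaded instead.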
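The priority rule in the note can be sketched as a small selection function. This is a hedged illustration, not the training script's actual loading code: `pick_weights` is a hypothetical helper, the paths are made up, and the `exists` predicate is injected so the example runs without real model files.

```python
import os


def pick_weights(checkpoints, pretrain_weights, exists=os.path.exists):
    """Prefer Global.checkpoints; fall back to Global.pretrain_weights.

    Returns (source_name, path), or (None, None) when neither is usable.
    """
    if checkpoints and exists(checkpoints):
        return "checkpoints", checkpoints
    if pretrain_weights and exists(pretrain_weights):
        return "pretrain_weights", pretrain_weights
    return None, None


# Pretend only these paths exist on disk (illustrative values).
available = {"./output/latest", "./pretrain/best_accuracy"}
on_disk = available.__contains__

both_ok = pick_weights("./output/latest", "./pretrain/best_accuracy", exists=on_disk)
bad_ckpt = pick_weights("./output/typo", "./pretrain/best_accuracy", exists=on_disk)
```

The silent fallback is worth knowing about: a mistyped checkpoint path does not stop training, it just starts from the pretrained weights.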
<a name="33"></a>
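### 3.3 Evaluation
Tables use [TEDS (Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src) as the model evaluation metric. Before evaluating, export the three models in the pipeline as inference models (we already provide these), and prepare the ground truth (gt) for evaluation. A gt example is shown below: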
@ -100,7 +113,7 @@ python3 table/eval_table.py --det_model_dir=path/to/det_model_dir --rec_model_di
```bash
teds: 93.32
```
<a name="34"></a>
### 3.4 Prediction
```python