# PaddleStructure

Install layoutparser:

```sh
wget https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
pip3 install layoutparser-0.0.0-py3-none-any.whl
```

## 1. Pipeline overview

PaddleStructure is an OCR toolkit for documents with complex layouts. Its pipeline is shown below:

![pipeline](../doc/table/pipeline.png)

In PaddleStructure, an image first goes through layout analysis by layoutparser, which classifies the regions of the image. Each region is then sent to the OCR flow that matches its category.

layoutparser currently outputs five categories:

1. Text
2. Title
3. Figure
4. List
5. Table

Categories 1-4 go through the conventional OCR flow; category 5 goes through the table OCR flow.
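
The routing rule above can be sketched as a small dispatch function. This is only an illustration, not PaddleStructure's actual implementation: `route_regions`, the region dictionaries with a `type` key, and the two stand-in OCR callables are all hypothetical names introduced for this example.

```python
from typing import Any, Callable, Dict, List

# Category names as listed above; categories 1-4 share the conventional OCR flow.
TEXT_LIKE = {"Text", "Title", "Figure", "List"}
TABLE = "Table"


def route_regions(
    regions: List[Dict[str, Any]],
    text_ocr: Callable[[Dict[str, Any]], Any],
    table_ocr: Callable[[Dict[str, Any]], Any],
) -> List[Dict[str, Any]]:
    """Send each layout region to the OCR flow that matches its category.

    `regions`, `text_ocr`, and `table_ocr` are hypothetical stand-ins for the
    layout analysis output and the two OCR flows.
    """
    results = []
    for region in regions:
        category = region["type"]
        if category == TABLE:
            output = table_ocr(region)
        elif category in TEXT_LIKE:
            output = text_ocr(region)
        else:
            raise ValueError(f"Unknown layout category: {category!r}")
        results.append({"type": category, "res": output})
    return results


# Usage with stand-in OCR functions that only report which flow was chosen.
regions = [
    {"type": "Title", "bbox": [0, 0, 100, 20]},
    {"type": "Table", "bbox": [0, 30, 100, 200]},
]
routed = route_regions(regions, lambda r: "text-ocr", lambda r: "table-ocr")
for item in routed:
    print(item)
```

Keeping the category sets in one place makes it straightforward to adjust the routing if layoutparser's categories change.
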
## 2. LayoutParser

[Documentation](layout/README.md)

## 3. Table OCR

[Documentation](table/README_ch.md)

## 4. PaddleStructure whl package

### 4.1 Usage

#### 4.1.1 Usage in Python code

```python
import os

import cv2
from PIL import Image
from paddlestructure import PaddleStructure, draw_result, save_res

table_engine = PaddleStructure(show_log=True)

save_folder = './output/table'
img_path = '../doc/table/1.png'
img = cv2.imread(img_path)
result = table_engine(img)
# Save the results to save_folder, named after the image's base name.
save_res(result, save_folder, os.path.basename(img_path).split('.')[0])

for line in result:
    print(line)

# Visualize the results; point font_path at a font file on your machine.
font_path = 'path/to/PaddleOCR/doc/fonts/simfang.ttf'
image = Image.open(img_path).convert('RGB')
im_show = draw_result(image, result, font_path=font_path)
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

#### 4.1.2 Command-line usage

```bash
paddlestructure --image_dir=../doc/table/1.png
```

### Parameters

Most parameters are the same as in the paddleocr whl package; see the [whl package documentation](../doc/doc_ch/whl.md).

| Parameter              | Description                                                                             | Default          |
|------------------------|-----------------------------------------------------------------------------------------|------------------|
| output                 | Directory where the Excel files and recognition results are saved                       | ./output/table   |
| structure_max_len      | Target length of the image's longer side when resizing for structure model prediction   | 488              |
| structure_model_dir    | Path to the structure inference model                                                   | None             |
| structure_char_type    | Path to the dictionary used by the structure model                                      | ../ppocr/utils/dict/table_structure_dict.tx |
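
To make the effect of `structure_max_len` concrete, here is a minimal sketch of a long-side resize computation. It assumes the aspect ratio is preserved and the result is rounded to whole pixels; the actual preprocessing in the structure model may differ (for example, it may pad the image or handle small images differently). The helper `resize_long_side` is a hypothetical name used only for this example.

```python
from typing import Tuple


def resize_long_side(height: int, width: int, max_len: int = 488) -> Tuple[int, int]:
    """Return (new_height, new_width) so that the longer side equals max_len.

    Assumption: the aspect ratio is kept and sizes are rounded to integers.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    scale = max_len / max(height, width)
    new_h = max(1, round(height * scale))
    new_w = max(1, round(width * scale))
    return new_h, new_w


# A 976 x 488 image is scaled by 0.5 so that its longer side becomes 488.
print(resize_long_side(976, 488))  # (488, 244)
```

Under these assumptions, a larger `structure_max_len` keeps more detail at the cost of more computation per image.
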