Merge branch 'dygraph' into pgnet-postpro

Double_V 2021-04-12 18:45:21 +08:00 committed by GitHub
commit ccb3373d10
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
8 changed files with 149 additions and 5 deletions


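@@ -5,6 +5,25 @@
 - 2021.4.9: Supports detection and recognition of **80 languages**
 - 2021.4.9: Supports detection and recognition with a **lightweight, high-precision** English model
+PaddleOCR aims to build a rich, leading, and practical OCR tool library. It provides not only Chinese and English models for general scenarios, but also models trained specifically for English scenarios
+and multilingual models covering [80 languages](#语种缩写).
+The English model supports detection and recognition of uppercase and lowercase letters and common punctuation, with optimized recognition of space characters:
+<div align="center">
+<img src="../imgs_results/multi_lang/en_1.jpg" width="400" height="600">
+</div>
+The multilingual models cover Latin, Arabic, Traditional Chinese, Korean, Japanese, and others:
+<div align="center">
+<img src="../imgs_results/multi_lang/japan_2.jpg" width="600" height="300">
+<img src="../imgs_results/multi_lang/french_0.jpg" width="300" height="300">
+</div>
+This document briefly introduces how to use the multilingual models.
 - [1 Installation](#安装)
 - [1.1 paddle installation](#paddle安装)
 - [1.2 paddleocr package installation](#paddleocr_package_安装)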
@@ -68,7 +87,11 @@ Paddleocr currently supports 80 languages, which can be switched by modifying the --lang parameter
 paddleocr --image_dir doc/imgs/japan_2.jpg --lang=japan
 ```
-![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.0/doc/imgs/japan_2.jpg)
+<div align="center">
+<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs/japan_2.jpg" width="800">
+</div>
 The result is a list; each item contains the text box, the text, and the recognition confidence
 ```text
@@ -138,8 +161,10 @@ im_show.save('result.jpg')
 ```
 Result visualization:
-![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.0/doc/imgs_results/korean.jpg)
+<div align="center">
+<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_results/korean.jpg" width="800">
+</div>
 * Recognition prediction
@@ -152,7 +177,8 @@ for line in result:
     print(line)
 ```
-![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.0/doc/imgs_words/german/1.jpg)
+![](../imgs_words/german/1.jpg)
 The result is a tuple containing only the recognition result and the recognition confidence
@@ -187,7 +213,10 @@ im_show.save('result.jpg')
 ```
 Result visualization
-![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.0/doc/imgs_results/whl/12_det.jpg)
+<div align="center">
+<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_results/whl/12_det.jpg" width="800">
+</div>
 ppocr also supports direction classification. For more usage, see the [whl package instructions](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_ch/whl.md).
@@ -233,7 +262,7 @@ ppocr supports custom training or finetune with your own data, where the recognition
 |卡纳达文|Kannada |kn|
 |泰米尔文|Tamil |ta|
 |南非荷兰文 |Afrikaans |af|
 |阿塞拜疆文 |Azerbaijani |az|
 |波斯尼亚文|Bosnian|bs|
 |捷克文|Czech|cs|
 |威尔士文 |Welsh |cy|
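
The table rows in this hunk pair each language with an abbreviation, and the documentation says languages are switched through `--lang`. As a minimal sketch, assuming those abbreviations are accepted by `--lang` exactly as listed, the following builds a `paddleocr` command line in the form shown earlier in this diff. `LANG_CODES` and `build_command` are hypothetical names used only for illustration, and only the seven rows visible in this hunk are included.

```python
import shlex

# Abbreviations copied from the table rows visible in this hunk only;
# the full table in the documentation is longer.
LANG_CODES = {
    "Kannada": "kn",
    "Tamil": "ta",
    "Afrikaans": "af",
    "Azerbaijani": "az",
    "Bosnian": "bs",
    "Czech": "cs",
    "Welsh": "cy",
}


def build_command(image_path, language):
    """Return a paddleocr CLI string for the given image and language name.

    Assumption: the table abbreviation is the value expected by --lang.
    """
    if language not in LANG_CODES:
        raise ValueError(f"no abbreviation known for language: {language}")
    code = LANG_CODES[language]
    # shlex.quote protects paths that contain spaces or shell metacharacters.
    return f"paddleocr --image_dir {shlex.quote(image_path)} --lang={code}"


if __name__ == "__main__":
    print(build_command("doc/imgs/japan_2.jpg", "Welsh"))
```

Keeping the mapping in one dictionary means an unsupported language fails early with a clear error instead of being passed through to the command line.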


@@ -5,6 +5,26 @@
 -2021.4.9 supports the detection and recognition of 80 languages
 -2021.4.9 supports **lightweight high-precision** English model detection and recognition
+PaddleOCR aims to create a rich, leading, and practical OCR tool library. It provides not only
+Chinese and English models for general scenarios, but also models trained specifically
+for English scenarios and multilingual models covering [80 languages](#language_abbreviations).
+The English model supports the detection and recognition of uppercase and lowercase
+letters and common punctuation, with optimized recognition of space characters:
+<div align="center">
+<img src="../imgs_results/multi_lang/en_1.jpg" width="400" height="600">
+</div>
+The multilingual models cover Latin, Arabic, Traditional Chinese, Korean, Japanese, etc.:
+<div align="center">
+<img src="../imgs_results/multi_lang/japan_2.jpg" width="600" height="300">
+<img src="../imgs_results/multi_lang/french_0.jpg" width="300" height="300">
+</div>
+This document briefly introduces how to use the multilingual models.
 -[1 Installation](#Install)
 -[1.1 paddle installation](#paddleinstallation)
 -[1.2 paddleocr package installation](#paddleocr_package_install)

Binary file not shown. Size: 534 KiB

Binary file not shown. Size: 558 KiB

Binary file not shown. Size: 232 KiB

Binary file not shown. Size: 249 KiB

Binary file not shown. Size: 461 KiB

ppocr/utils/en_dict.txt (normal file, 95 lines added)

@@ -0,0 +1,95 @@
0
1
2
3
4
5
6
7
8
9
:
;
<
=
>
?
@
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
[
\
]
^
_
`
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
{
|
}
~
!
"
#
$
%
&
'
(
)
*
+
,
-
.
/
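
The 94 entries visible above are the printable ASCII characters other than space, one per line. The hunk header reports 95 lines, so one entry is not visible in this view; it is left out here rather than guessed. As a minimal sketch of how a character dictionary like this can be used, the following maps each character to its 0-based line position and converts text to and from index lists. How PaddleOCR itself consumes the file is not shown in this commit, and `build_char_index`, `encode`, and `decode` are hypothetical helper names.

```python
# The 94 characters visible in the listing, in file order.
VISIBLE_CHARS = (
    "0123456789"
    ":;<=>?@"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "[\\]^_`"
    "abcdefghijklmnopqrstuvwxyz"
    "{|}~"
    "!\"#$%&'()*+,-./"
)


def build_char_index(chars):
    """Map each character to its 0-based position in the dictionary."""
    return {ch: i for i, ch in enumerate(chars)}


def encode(text, char_to_idx):
    """Convert text to indices, skipping characters missing from the dictionary."""
    return [char_to_idx[ch] for ch in text if ch in char_to_idx]


def decode(indices, chars):
    """Convert a list of indices back to text."""
    return "".join(chars[i] for i in indices)


char_to_idx = build_char_index(VISIBLE_CHARS)

if __name__ == "__main__":
    ids = encode("Hello, OCR!", char_to_idx)
    # The space is dropped because it is not among the visible entries.
    print(ids, decode(ids, VISIBLE_CHARS))
```

Because the space character is not among the visible entries, `encode` drops it in this sketch; a dictionary that includes a space entry would preserve it.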