Merge remote-tracking branch 'upstream/develop' into zxdev

zhangxin 2020-06-11 13:46:31 +08:00
commit a65c721c22
2 changed files with 13 additions and 2 deletions

View File

@@ -4,7 +4,7 @@
The installed paddle version is incorrect. Currently this project only supports paddle 1.7; it will be adapted to 1.8 in the near future.
2. **Error when converting the attention recognition model: KeyError: 'predict'**
-Inference for recognition models based on the Attention loss is still being debugged. For Chinese text recognition, it is recommended to prefer recognition models based on the CTC loss; in practice, models based on the Attention loss have also been found to be less effective than those based on the CTC loss.
+This problem has been solved; please update to the latest code.
3. **About inference speed**
When an image contains a lot of text, the prediction time increases. You can use `--rec_batch_num` to set a smaller prediction batch size; the default value is 30, which can be changed to 10 or another value.
@@ -41,3 +41,8 @@ PaddleOCR has been adapted to Windows and Mac. Note two points when running: 1. In [
Chinese dataset: the LSVT street-view dataset, with images cropped according to the ground truth and position-corrected, about 300k images in total. In addition, 5 million images were synthesized based on the LSVT corpus.
Among them, the public datasets are all open source; users can search for and download them themselves, or refer to [Chinese datasets](./datasets.md). The synthetic data is not open source for now; users can generate their own with open-source synthesis tools such as [text_renderer](https://github.com/Sanster/text_renderer), [SynthText](https://github.com/ankush-me/SynthText), and [TextRecognitionDataGenerator](https://github.com/Belval/TextRecognitionDataGenerator).
+10. **Error when predicting with a recognition model that uses TPS**
+Error message: Input(X) dims[3] and Input(Grid) dims[2] should be equal, but received X dimension[3](320) != Grid dimension[2](100)
+Cause: the TPS module cannot yet handle variable-length input. Please set `--rec_image_shape='3,32,100' --rec_char_type='en'` to fix the input shape.
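As a concrete illustration of the fix in the TPS item above, here is a minimal sketch of how those two flags could be passed to the recognition inference script. The entry point `tools/infer/predict_rec.py` and the `--image_dir`/`--rec_model_dir` values are assumptions for illustration; only `--rec_image_shape` and `--rec_char_type` come from the answer itself.

```bash
# Sketch only: fix the recognition input shape so the TPS module receives the
# 3x32x100 tensor it expects. Script path, image path, and model directory are
# assumed; adjust them to your local setup.
python3 tools/infer/predict_rec.py \
    --image_dir="./doc/imgs_words/en/word_1.png" \
    --rec_model_dir="./inference/rec_tps/" \
    --rec_image_shape="3,32,100" \
    --rec_char_type="en"
```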

View File

@@ -4,7 +4,7 @@
The installed version of paddle is incorrect. Currently, this project only supports paddle 1.7; it will be adapted to 1.8 in the near future.
2. **Error when converting the attention recognition model: KeyError: 'predict'**
-Inference for recognition models based on the attention loss is still being debugged. For Chinese text recognition, it is recommended to choose a recognition model based on the CTC loss first; in practice, recognition models based on the attention loss have also been found to be less effective than those based on the CTC loss.
+Solved. Please update to the latest version of the code.
3. **About inference speed**
When an image contains a lot of text, the prediction time increases. You can use `--rec_batch_num` to set a smaller prediction batch size; the default value is 30, which can be changed to 10 or another value.
@@ -43,3 +43,9 @@ At present, the open-source models, datasets, and their magnitudes are as follows:
Chinese dataset: the LSVT street-view dataset with cropped text areas, about 300k images in total. In addition, 5 million images were synthesized based on the LSVT corpus.
Among them, the public datasets are open source; users can search for and download them themselves, or refer to [Chinese datasets](./datasets_en.md). The synthetic data is not open source; users can generate their own with open-source synthesis tools such as [text_renderer](https://github.com/Sanster/text_renderer), [SynthText](https://github.com/ankush-me/SynthText), and [TextRecognitionDataGenerator](https://github.com/Belval/TextRecognitionDataGenerator).
+10. **Error when using a model with the TPS module for prediction**
+Error message: Input(X) dims[3] and Input(Grid) dims[2] should be equal, but received X dimension[3](108) != Grid dimension[2](100)
+Solution: TPS does not support variable input shapes. Please set `--rec_image_shape='3,32,100'` and `--rec_char_type='en'`.
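Relatedly, item 3 in both files suggests lowering `--rec_batch_num` for text-dense images. A minimal sketch, assuming the end-to-end `tools/infer/predict_system.py` entry point and local inference model directories (the paths here are placeholders, not taken from the diff):

```bash
# Sketch only: reduce the recognition batch size from the default 30 to 10 so
# images with many text lines use less memory per prediction step. Script path,
# image path, and model directories are assumed.
python3 tools/infer/predict_system.py \
    --image_dir="./doc/imgs/11.jpg" \
    --det_model_dir="./inference/det_db/" \
    --rec_model_dir="./inference/rec_crnn/" \
    --rec_batch_num=10
```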