
# Add new algorithm

PaddleOCR decomposes an algorithm into the following parts, and modularizes each part to make it more convenient to develop new algorithms.

* Data loading and processing
* Network
* Post-processing
* Loss
* Metric
* Optimizer

The following sections introduce each part in turn and explain how to add the modules required for a new algorithm.


## Data loading and processing

Data loading and processing is composed of different modules that handle image reading, data augmentation and label production. This part is under [ppocr/data](../../ppocr/data). The files and folders are explained below:

```bash
ppocr/data/
├── imaug             # Scripts for image reading, data augmentation and label production
│   ├── label_ops.py  # Modules that transform the label
│   ├── operators.py  # Modules that transform the image
│   ├──.....
├── __init__.py
├── lmdb_dataset.py   # The dataset that reads the lmdb
└── simple_dataset.py # Read the dataset saved in the form of `image_path\tgt`
```

PaddleOCR has a large number of built-in image operation modules. To add a module that is not built in, follow these steps:

1. Create a new file under the [ppocr/data/imaug](../../ppocr/data/imaug) folder, such as my_module.py.
2. Add code to the my_module.py file. The sample code is as follows:

```python
class MyModule:
    def __init__(self, *args, **kwargs):
        # your init code
        pass

    def __call__(self, data):
        img = data['image']
        label = data['label']
        # your process code

        data['image'] = img
        data['label'] = label
        return data
```

3. Import the added module in the [ppocr/data/imaug/\__init\__.py](../../ppocr/data/imaug/__init__.py) file.
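
As a concrete illustration of the template above, here is a minimal sketch of a label-cleaning module. The class name `LowercaseLabel` and its `strip` parameter are hypothetical and not part of PaddleOCR; the only thing taken from this guide is the contract that `__call__` receives a `data` dict and returns it.

```python
class LowercaseLabel:
    """Hypothetical transform: normalizes the text label of one sample."""

    def __init__(self, strip=True, **kwargs):
        # Keyword arguments would come from the YAML entry for this module.
        self.strip = strip

    def __call__(self, data):
        label = data['label']
        if self.strip:
            label = label.strip()
        data['label'] = label.lower()
        # The image is left untouched; only the label is rewritten.
        return data


sample = {'image': None, 'label': '  Hello OCR '}
out = LowercaseLabel()(sample)
print(out['label'])  # hello ocr
```

Because the module only reads and writes keys of `data`, it can be placed anywhere in the transform list after the step that produces `label`.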

All data processing modules are combined as a list in the config file and executed in sequence. For example:

```yaml
# angle class data process
transforms:
  - DecodeImage: # load image
      img_mode: BGR
      channel_first: False
  - MyModule:
      args1: args1
      args2: args2
  - KeepKeys:
      keep_keys: [ 'image', 'label' ] # dataloader will return list in this order
```
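
The way such a list could be turned into a running pipeline can be sketched as follows. This is a simplified, framework-free illustration, not PaddleOCR's actual loader: the `REGISTRY` dict, the helper names, and the toy classes `ToyAddOne` and `ToyKeepKeys` are all assumptions made for the example.

```python
class ToyAddOne:
    """Toy stand-in for an image transform: increments one field."""

    def __init__(self, key='image', **kwargs):
        self.key = key

    def __call__(self, data):
        data[self.key] += 1
        return data


class ToyKeepKeys:
    """Toy stand-in for the final step: returns the kept values as a list."""

    def __init__(self, keep_keys, **kwargs):
        self.keep_keys = keep_keys

    def __call__(self, data):
        return [data[key] for key in self.keep_keys]


# Hypothetical name-to-class lookup; importing a module in __init__.py
# plays a similar role by making the class name resolvable.
REGISTRY = {'ToyAddOne': ToyAddOne, 'ToyKeepKeys': ToyKeepKeys}


def create_operators(op_param_list):
    """Instantiate one operator per single-key dict, in list order."""
    ops = []
    for entry in op_param_list:
        (name, params), = entry.items()
        ops.append(REGISTRY[name](**(params or {})))
    return ops


def run_transforms(data, ops):
    """Feed the sample through every operator in sequence."""
    for op in ops:
        data = op(data)
        if data is None:  # assumption: a transform may drop a sample
            return None
    return data


config = [
    {'ToyAddOne': {'key': 'image'}},
    {'ToyAddOne': {'key': 'image'}},
    {'ToyKeepKeys': {'keep_keys': ['image', 'label']}},
]
ops = create_operators(config)
result = run_transforms({'image': 0, 'label': 'a', 'extra': 1}, ops)
print(result)  # [2, 'a']
```

The order of `keep_keys` determines the order of the returned list, which is why later code can refer to the label by position.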

## Network

The network part builds the network. PaddleOCR divides the network into four parts, which are under [ppocr/modeling](../../ppocr/modeling). Data entering the network passes through these four parts in sequence (transforms -> backbones -> necks -> heads).

```bash
├── architectures # Code for building network
├── transforms    # Image Transformation Module
├── backbones     # Feature extraction module
├── necks         # Feature enhancement module
└── heads         # Output module
```

PaddleOCR has built-in modules commonly used by algorithms such as DB, EAST, SAST, CRNN and Attention. To add a module that is not built in, follow the steps below. The steps are the same for all four parts; backbones are used as the example:

1. Create a new file under the [ppocr/modeling/backbones](../../ppocr/modeling/backbones) folder, such as my_backbone.py.
2. Add code to the my_backbone.py file. The sample code is as follows:

```python
import paddle
import paddle.nn as nn
import paddle.nn.functional as F


class MyBackbone(nn.Layer):
    def __init__(self, *args, **kwargs):
        super(MyBackbone, self).__init__()
        # your init code; the Conv2D below is only a placeholder layer
        self.conv = nn.Conv2D(in_channels=3, out_channels=32, kernel_size=3)

    def forward(self, inputs):
        # your network forward
        y = self.conv(inputs)
        return y
```

3. Import the added module in the [ppocr/modeling/backbones/\__init\__.py](../../ppocr/modeling/backbones/__init__.py) file.

After adding the modules for the four parts of the network, you only need to configure them in the configuration file to use them, such as:

```yaml
Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
    name: MyTransform
    args1: args1
    args2: args2
  Backbone:
    name: MyBackbone
    args1: args1
  Neck:
    name: MyNeck
    args1: args1
  Head:
    name: MyHead
    args1: args1
```
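
To make the transforms -> backbones -> necks -> heads flow concrete, the sketch below chains configured stages in that fixed order. It is a framework-free illustration under stated assumptions: `STAGES`, `build_network`, and the toy `Scale` and `Total` classes are hypothetical, and skipping a section that is absent from the config is an assumption of this sketch rather than documented behaviour.

```python
class Scale:
    """Toy stage: multiplies every element by a constant factor."""

    def __init__(self, factor=1, **kwargs):
        self.factor = factor

    def __call__(self, x):
        return [value * self.factor for value in x]


class Total:
    """Toy stage: reduces the features to a single number."""

    def __init__(self, **kwargs):
        pass

    def __call__(self, x):
        return sum(x)


# Hypothetical lookup from the `name` field to a class.
STAGES = {'Scale': Scale, 'Total': Total}


def build_network(arch_config):
    """Build the stages in the fixed order and return a forward function."""
    stages = []
    for section in ('Transform', 'Backbone', 'Neck', 'Head'):
        section_cfg = arch_config.get(section)
        if section_cfg is None:  # assumption: missing sections are skipped
            continue
        params = dict(section_cfg)
        name = params.pop('name')  # remaining keys become constructor args
        stages.append(STAGES[name](**params))

    def forward(x):
        for stage in stages:
            x = stage(x)
        return x

    return forward


arch = {
    'model_type': 'rec',
    'Backbone': {'name': 'Scale', 'factor': 2},
    'Head': {'name': 'Total'},
}
net = build_network(arch)
output = net([1, 2, 3])
print(output)  # 12
```

Every key under a section other than `name` is passed to the constructor, which mirrors how `args1` and `args2` appear next to `name` in the YAML above.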

## Post-processing

Post-processing decodes the network output to obtain text boxes or recognized text. This part is under [ppocr/postprocess](../../ppocr/postprocess).
PaddleOCR has built-in post-processing modules for algorithms such as DB, EAST, SAST, CRNN and Attention. To add a module that is not built in, follow these steps:

1. Create a new file under the [ppocr/postprocess](../../ppocr/postprocess) folder, such as my_postprocess.py.
2. Add code to the my_postprocess.py file. The sample code is as follows:

```python
import paddle


class MyPostProcess:
    def __init__(self, *args, **kwargs):
        # your init code
        pass

    def __call__(self, preds, label=None, *args, **kwargs):
        if isinstance(preds, paddle.Tensor):
            preds = preds.numpy()
        # your preds decode code
        preds = self.decode_preds(preds)
        if label is None:
            return preds
        # your label decode code
        label = self.decode_label(label)
        return preds, label

    def decode_preds(self, preds):
        # your preds decode code
        pass

    def decode_label(self, label):
        # your label decode code
        pass
```

3. Import the added module in the [ppocr/postprocess/\__init\__.py](../../ppocr/postprocess/__init__.py) file.
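
As one possible way to fill in the template, the sketch below implements greedy CTC-style decoding, the kind of decoding a CRNN recognizer typically needs. It is an illustration under assumptions, not PaddleOCR's implementation: the class name `GreedyCTCDecode` is hypothetical, class index 0 is assumed to be the blank symbol, and plain Python lists stand in for tensors so the example runs without Paddle.

```python
class GreedyCTCDecode:
    """Hypothetical post-process: best-path CTC decoding on plain lists."""

    BLANK = 0  # assumption: class 0 is the CTC blank symbol

    def __init__(self, characters, **kwargs):
        # characters[k - 1] is the character for class index k
        self.characters = characters

    def __call__(self, preds, label=None, *args, **kwargs):
        texts = [self.decode_preds(seq) for seq in preds]
        if label is None:
            return texts
        return texts, [self.decode_label(ids) for ids in label]

    def decode_preds(self, steps):
        # steps: one list of class scores per time step
        ids = [max(range(len(s)), key=s.__getitem__) for s in steps]
        chars = []
        prev = self.BLANK
        for idx in ids:
            # Collapse repeated indices, then drop blanks.
            if idx != self.BLANK and idx != prev:
                chars.append(self.characters[idx - 1])
            prev = idx
        return ''.join(chars)

    def decode_label(self, ids):
        return ''.join(
            self.characters[i - 1] for i in ids if i != self.BLANK)


decoder = GreedyCTCDecode(characters='ab')
scores = [[
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.8, 0.1],    # a (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # a (new, separated by blank)
    [0.1, 0.2, 0.7],    # b
]]
texts, labels = decoder(scores, label=[[1, 2]])
print(texts, labels)  # ['aab'] ['ab']
```

The blank between the second and third `a` is what allows a genuinely doubled character to survive the collapse step.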

After the post-processing module is added, you only need to configure it in the configuration file to use it, such as:

```yaml
PostProcess:
  name: MyPostProcess
  args1: args1
  args2: args2
```

## Loss

The loss function is used to calculate the distance between the network output and the label. This part is under [ppocr/losses](../../ppocr/losses).
PaddleOCR has built-in loss function modules for algorithms such as DB, EAST, SAST, CRNN and Attention. To add a module that is not built in, follow these steps:

1. Create a new file in the [ppocr/losses](../../ppocr/losses) folder, such as my_loss.py.
2. Add code to the my_loss.py file. The sample code is as follows:

```python
import paddle
from paddle import nn


class MyLoss(nn.Layer):
    def __init__(self, **kwargs):
        super(MyLoss, self).__init__()
        # your init code; define self.loss (used in forward) here
        pass

    def forward(self, predicts, batch):
        label = batch[1]
        # your loss code
        loss = self.loss(input=predicts, label=label)
        return {'loss': loss}
```

3. Import the added module in the [ppocr/losses/\__init\__.py](../../ppocr/losses/__init__.py) file.

After the loss function module is added, you only need to configure it in the configuration file to use it, such as:

```yaml
Loss:
  name: MyLoss
  args1: args1
  args2: args2
```

## Metric

The metric is used to calculate the performance of the network on the current batch. This part is under [ppocr/metrics](../../ppocr/metrics). PaddleOCR has built-in evaluation modules for tasks such as detection, classification and recognition. To add a module that is not built in, follow these steps:

1. Create a new file under the [ppocr/metrics](../../ppocr/metrics) folder, such as my_metric.py.
2. Add code to the my_metric.py file. The sample code is as follows:

```python

class MyMetric(object):
    def __init__(self, main_indicator='acc', **kwargs):
        # main_indicator is used to select the best model
        self.main_indicator = main_indicator
        self.reset()

    def __call__(self, preds, batch, *args, **kwargs):
        # preds is the output of the post-processing
        # batch is the output of the dataloader
        labels = batch[1]
        cur_correct_num = 0
        cur_all_num = 0
        # your metric code
        self.correct_num += cur_correct_num
        self.all_num += cur_all_num
        return {'acc': cur_correct_num / cur_all_num, }

    def get_metric(self):
        """
        return metrics {
                 'acc': 0,
                 'norm_edit_dis': 0,
            }
        """
        acc = self.correct_num / self.all_num
        self.reset()
        return {'acc': acc}

    def reset(self):
        # reset metric
        self.correct_num = 0
        self.all_num = 0

```

3. Import the added module in the [ppocr/metrics/\__init\__.py](../../ppocr/metrics/__init__.py) file.
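
The per-batch call and the final `get_metric` call in the template can be illustrated with a small exact-match accuracy metric. This is a sketch under assumptions: the class name `ExactMatchMetric` is hypothetical, and `preds` and `batch[1]` are assumed to be plain lists of strings, which may differ from what a real post-processing module and dataloader return.

```python
class ExactMatchMetric:
    """Hypothetical metric: fraction of predictions equal to the label."""

    def __init__(self, main_indicator='acc', **kwargs):
        self.main_indicator = main_indicator
        self.reset()

    def __call__(self, preds, batch, *args, **kwargs):
        labels = batch[1]
        correct = sum(p == t for p, t in zip(preds, labels))
        # Accumulate counts so get_metric() covers all batches seen so far.
        self.correct_num += correct
        self.all_num += len(labels)
        return {'acc': correct / max(len(labels), 1)}

    def get_metric(self):
        # max(..., 1) avoids dividing by zero when no sample was seen.
        acc = self.correct_num / max(self.all_num, 1)
        self.reset()
        return {'acc': acc}

    def reset(self):
        self.correct_num = 0
        self.all_num = 0


metric = ExactMatchMetric()
first = metric(['ab', 'cd'], (None, ['ab', 'xx']))  # 1 of 2 correct
second = metric(['ef'], (None, ['ef']))             # 1 of 1 correct
result = metric.get_metric()                        # 2 of 3 overall
print(first, second)  # {'acc': 0.5} {'acc': 1.0}
```

Note that the overall result is computed from accumulated counts rather than by averaging the per-batch values, so batches of different sizes are weighted correctly.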

After the metric module is added, you only need to configure it in the configuration file to use it, such as:

```yaml
Metric:
  name: MyMetric
  main_indicator: acc
```

## Optimizer

The optimizer is used to train the network. The optimizer part also contains the learning rate decay and network regularization modules. This part is under [ppocr/optimizer](../../ppocr/optimizer). PaddleOCR has built-in commonly used optimizer modules such as `Momentum`, `Adam` and `RMSProp`, common learning rate decay modules such as `Linear`, `Cosine`, `Step` and `Piecewise`, and common regularization modules such as `L1Decay` and `L2Decay`.
To add a module that is not built in, follow the steps below, taking `optimizer` as the example:

1. Create your own optimizer in the [ppocr/optimizer/optimizer.py](../../ppocr/optimizer/optimizer.py) file. The sample code is as follows:

```python
from paddle import optimizer as optim


class MyOptim(object):
    def __init__(self, learning_rate=0.001, *args, **kwargs):
        self.learning_rate = learning_rate

    def __call__(self, parameters):
        # It is recommended to wrap the built-in optimizer of paddle
        opt = optim.XXX(
            learning_rate=self.learning_rate,
            parameters=parameters)
        return opt

```

After the optimizer module is added, you only need to configure it in the configuration file to use it, such as:

```yaml
Optimizer:
  name: MyOptim
  args1: args1
  args2: args2
  lr:
    name: Cosine
    learning_rate: 0.001
  regularizer:
    name: 'L2'
    factor: 0
```