Example¶
Standard NER¶
The standard NER module is implemented with the pretrained model BERT.
Step 1
Enter the DeepKE/example/ner/standard folder.
Step 2
Get data:
wget 120.27.214.45/Data/ner/standard/data.tar.gz
tar -xzvf data.tar.gz
The dataset and parameters can be customized in the data folder and the conf folder, respectively.
The dataset needs to be input as a TXT file.
Each line of the file contains one character and its BIO tag, separated by a space:
杭 B-LOC
州 I-LOC
真 O
美 O
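As a sketch, the character-level format above can be written and read back with a few lines of Python. The helper names and the file name are illustrative, not part of DeepKE:

```python
# Hypothetical helpers for the character-level BIO format shown above.
def write_bio_txt(path, pairs):
    """Write (character, tag) pairs, one per line, separated by a space."""
    with open(path, "w", encoding="utf-8") as f:
        for char, tag in pairs:
            f.write(f"{char} {tag}\n")

def read_bio_txt(path):
    """Read the file back into a list of (character, tag) pairs."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                char, tag = line.split()
                pairs.append((char, tag))
    return pairs

sample = [("杭", "B-LOC"), ("州", "I-LOC"), ("真", "O"), ("美", "O")]
write_bio_txt("train_sample.txt", sample)
assert read_bio_txt("train_sample.txt") == sample
```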
Step 3
Train:
python run.py
Step 4
Predict:
python predict.py
cd example/ner/standard
wget 120.27.214.45/Data/ner/standard/data.tar.gz
tar -xzvf data.tar.gz
python run.py
python predict.py
Few-shot NER¶
This module is for the low-resource scenario.
Step 1
Enter the DeepKE/example/ner/few-shot folder.
Step 2
Get data:
wget 120.27.214.45/Data/ner/few_shot/data.tar.gz
tar -xzvf data.tar.gz
The directory where the model is loaded and saved, as well as the configuration parameters, can be customized in the conf folder. The dataset can be customized in the data folder.
The dataset needs to be input as a TXT file.
Each line of the file contains one token and its BIO tag, separated by a space:
EU B-ORG
rejects O
German B-MISC
call O
to O
boycott O
British B-MISC
lamb O
. O
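For illustration, a BIO-tagged token sequence like the sample above can be decoded into entity spans. This is a generic sketch, not DeepKE's own decoding code:

```python
# Decode a BIO-tagged token sequence into (entity_text, label) spans.
def bio_to_spans(tokens, tags):
    spans, current, label = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:  # close any span in progress
                spans.append((" ".join(current), label))
            current, label = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            current.append(tok)  # continue the current span
        else:
            if current:
                spans.append((" ".join(current), label))
            current, label = [], None
    if current:
        spans.append((" ".join(current), label))
    return spans

tokens = ["EU", "rejects", "German", "call", "to", "boycott", "British", "lamb", "."]
tags = ["B-ORG", "O", "B-MISC", "O", "O", "O", "B-MISC", "O", "O"]
print(bio_to_spans(tokens, tags))
# [('EU', 'ORG'), ('German', 'MISC'), ('British', 'MISC')]
```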
Step 3
Train with CoNLL-2003:
python run.py
Train in the few-shot scenario:
python run.py +train=few_shot
To start from an existing trained model, modify load_path in conf/train/few_shot.yaml.
Step 4
Predict:
Add - predict to conf/config.yaml, set load_path to the model path and write_path to the path where the predicted results will be saved in conf/predict.yaml, and then run python predict.py
cd example/ner/few-shot
wget 120.27.214.45/Data/ner/few_shot/data.tar.gz
tar -xzvf data.tar.gz
python run.py
python predict.py
Standard RE¶
The standard module is implemented by common deep learning models, including CNN, RNN, Capsule, GCN, Transformer and the pretrained model.
Step 1
Enter the DeepKE/example/re/standard folder.
Step 2
Get data:
wget 120.27.214.45/Data/re/standard/data.tar.gz
tar -xzvf data.tar.gz
The dataset and parameters can be customized in the data folder and the conf folder, respectively.
The dataset needs to be input as a CSV file with the following columns:
Sentence | Relation | Head | Head_offset | Tail | Tail_offset
The relation file needs to comply with the following columns:
Head_type | Tail_type | relation | Index
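As a minimal sketch (not DeepKE's own code), a row in the CSV column layout above can be produced with the standard csv module; the sentence, relation name, and file name are made-up examples:

```python
import csv

header = ["Sentence", "Relation", "Head", "Head_offset", "Tail", "Tail_offset"]

sentence = "The company was founded by John ."
head, tail = "company", "John"
row = [sentence, "founded_by",
       head, str(sentence.index(head)),  # character offset of the head entity
       tail, str(sentence.index(tail))]  # character offset of the tail entity

with open("re_sample.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerow(row)

with open("re_sample.csv", encoding="utf-8") as f:
    for r in csv.DictReader(f):
        print(r["Head"], r["Relation"], r["Tail"])  # company founded_by John
```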
Step 3
Train:
python run.py
Step 4
Predict:
python predict.py
cd example/re/standard
wget 120.27.214.45/Data/re/standard/data.tar.gz
tar -xzvf data.tar.gz
python run.py
python predict.py
Few-shot RE¶
This module is for the low-resource scenario.
Step 1
Enter the DeepKE/example/re/few-shot folder.
Step 2
Get data:
wget 120.27.214.45/Data/re/few_shot/data.tar.gz
tar -xzvf data.tar.gz
The dataset and parameters can be customized in the data folder and the conf folder, respectively.
The dataset needs to be input as a TXT file together with a JSON file.
Each line of the data file is a JSON object in the following format:
{"token": ["the", "most", "common", "audits", "were", "about", "waste", "and", "recycling", "."], "h": {"name": "audits", "pos": [3, 4]}, "t": {"name": "waste", "pos": [6, 7]}, "relation": "Message-Topic(e1,e2)"}
The relation file maps each relation to an index:
{"Other": 0, "Message-Topic(e1,e2)": 1, …}
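For illustration, the instance below mirrors the line format shown above, and rel2id mimics the relation-to-index mapping; both are sample values, not the real dataset:

```python
import json

instance = {
    "token": ["the", "most", "common", "audits", "were", "about",
              "waste", "and", "recycling", "."],
    "h": {"name": "audits", "pos": [3, 4]},
    "t": {"name": "waste", "pos": [6, 7]},
    "relation": "Message-Topic(e1,e2)",
}
rel2id = {"Other": 0, "Message-Topic(e1,e2)": 1}

# "pos" indexes into the token list (end-exclusive).
h_start, h_end = instance["h"]["pos"]
assert instance["token"][h_start:h_end] == ["audits"]

# Each line of the data file is one JSON-serialized instance.
line = json.dumps(instance)
print(rel2id[json.loads(line)["relation"]])  # 1
```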
Step 3
Train:
python run.py
To resume from the last trained model, set train_from_saved_model in conf/train.yaml to the path where that model was saved. The path for logs generated during training can be customized via log_dir.
Step 4
Predict:
python predict.py
cd example/re/few-shot
wget 120.27.214.45/Data/re/few_shot/data.tar.gz
tar -xzvf data.tar.gz
python run.py
python predict.py
Document RE¶
This module is for the document-level scenario.
Step 1
Enter the DeepKE/example/re/document folder.
Step 2
Get data:
wget 120.27.214.45/Data/re/document/data.tar.gz
tar -xzvf data.tar.gz
The dataset and parameters can be customized in the data folder and the conf folder, respectively.
The dataset needs to be input as a JSON file in the following format:
[{"vertexSet": [[{"name": "Lark Force", "pos": [0, 2], "sent_id": 0, "type": "ORG"}, …]],
"labels": [{"r": "P607", "h": 1, "t": 3, "evidence": [0]}, …],
"title": "Lark Force",
"sents": [["Lark", "Force", "was", "an", "Australian", "Army", "formation", "established", "in", "March", "1941", "during", "World", "War", "II", "for", "service", "in", "New", "Britain", "and", "New", "Ireland", "."], …]}, …]
The relation file maps each relation to an index:
{"P1376": 79, "P607": 27, …}
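As a sketch of how this document-level format fits together, the snippet below resolves a relation label's h/t indices to entity names via vertexSet. The document content is abbreviated sample data, not the real dataset:

```python
doc = {
    "title": "Lark Force",
    "vertexSet": [
        [{"name": "Lark Force", "pos": [0, 2], "sent_id": 0, "type": "ORG"}],
        [{"name": "Australian Army", "pos": [4, 6], "sent_id": 0, "type": "ORG"}],
    ],
    "labels": [{"r": "P607", "h": 0, "t": 1, "evidence": [0]}],
    "sents": [["Lark", "Force", "was", "an", "Australian", "Army", "formation", "."]],
}

for label in doc["labels"]:
    # "h" and "t" index into vertexSet; each entry is a list of mentions.
    head = doc["vertexSet"][label["h"]][0]["name"]
    tail = doc["vertexSet"][label["t"]][0]["name"]
    print(f'{head} --{label["r"]}--> {tail}')
# Lark Force --P607--> Australian Army
```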
Step 3
Train:
python run.py
To resume from the last trained model, set train_from_saved_model in conf/train.yaml to the path where that model was saved. The path for logs generated during training can be customized via log_dir.
Step 4
Predict:
python predict.py
cd example/re/document
wget 120.27.214.45/Data/re/document/data.tar.gz
tar -xzvf data.tar.gz
python run.py
python predict.py
Standard AE¶
The standard module is implemented by common deep learning models, including CNN, RNN, Capsule, GCN, Transformer and the pretrained model.
Step 1
Enter the DeepKE/example/ae/standard folder.
Step 2
Get data:
wget 120.27.214.45/Data/ae/standard/data.tar.gz
tar -xzvf data.tar.gz
The dataset and parameters can be customized in the data folder and the conf folder, respectively.
The dataset needs to be input as a CSV file with the following columns:
Sentence | Attribute | Entity | Entity_offset | Attribute_value | Attribute_value_offset
The attribute file needs to comply with the following columns:
Attribute | Index
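As a hedged sketch of the attribute-extraction CSV layout above, the snippet below writes one row and sanity-checks that an offset column really points at the quoted span in the sentence. The sentence, attribute name, and file name are made-up examples:

```python
import csv

header = ["Sentence", "Attribute", "Entity", "Entity_offset",
          "Attribute_value", "Attribute_value_offset"]
sentence = "Alan Turing was born in 1912 ."
entity, value = "Alan Turing", "1912"
row = [sentence, "birth_year",
       entity, str(sentence.index(entity)),
       value, str(sentence.index(value))]

with open("ae_sample.csv", "w", encoding="utf-8", newline="") as f:
    w = csv.writer(f)
    w.writerow(header)
    w.writerow(row)

# Sanity check: the offset must point at the attribute value in the sentence.
with open("ae_sample.csv", encoding="utf-8") as f:
    r = next(csv.DictReader(f))
    off = int(r["Attribute_value_offset"])
    assert r["Sentence"][off:off + len(r["Attribute_value"])] == r["Attribute_value"]
```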
Step 3
Train:
python run.py
Step 4
Predict:
python predict.py
cd example/ae/standard
wget 120.27.214.45/Data/ae/standard/data.tar.gz
tar -xzvf data.tar.gz
python run.py
python predict.py
For more details, refer to this video tutorial: https://www.bilibili.com/video/BV1n44y1x7iW?spm_id_from=333.999.0.0