Example
=======

Standard NER
------------

The standard NER module is built on the pretrained model BERT.

**Step 1**

Enter the ``DeepKE/example/ner/standard`` folder.

**Step 2**

Get the data:

``wget 120.27.214.45/Data/ner/standard/data.tar.gz``

``tar -xzvf data.tar.gz``

The dataset can be customized in the ``data`` folder and the parameters in the ``conf`` folder.

The dataset must be provided as a ``TXT`` file.

Each line of the file holds one token and its label, separated by a space and terminated by a newline (``\n``):

.. code-block:: text

    杭 B-LOC
    州 I-LOC
    真 O
    美 O

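The token-per-line format above can be read with a few lines of Python. This is a minimal sketch, not DeepKE's own loader: the function name ``parse_ner_lines`` is hypothetical, and it assumes the token and label are separated by a single space.

```python
def parse_ner_lines(lines):
    """Split 'token label' lines into parallel token and label lists."""
    tokens, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:
            # Skip blank lines instead of failing on them.
            continue
        # Assumption: the label is the last space-separated field.
        token, label = line.rsplit(" ", 1)
        tokens.append(token)
        labels.append(label)
    return tokens, labels


# The sample lines from the format description above.
sample = ["杭 B-LOC\n", "州 I-LOC\n", "真 O\n", "美 O\n"]
tokens, labels = parse_ner_lines(sample)
print(tokens)
print(labels)
```

Running a check like this over a custom file before training is a cheap way to catch lines with a missing label.
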
**Step 3**

Train:

``python run.py``

**Step 4**

Predict:

``python predict.py``

.. code-block:: bash

    cd example/ner/standard

    wget 120.27.214.45/Data/ner/standard/data.tar.gz

    tar -xzvf data.tar.gz

    python run.py

    python predict.py

Few-shot NER
------------

This module targets the low-resource scenario.

**Step 1**

Enter the ``DeepKE/example/ner/few-shot`` folder.

**Step 2**

Get the data:

``wget 120.27.214.45/Data/ner/few_shot/data.tar.gz``

``tar -xzvf data.tar.gz``

The directories from which the model is loaded and to which it is saved, along with the other configuration parameters, can be customized in the ``conf`` folder. The dataset can be customized in the ``data`` folder.

The dataset must be provided as a ``TXT`` file.

Each line of the file holds one token and its label, separated by a space and terminated by a newline (``\n``):

.. code-block:: text

    EU B-ORG
    rejects O
    German B-MISC
    call O
    to O
    boycott O
    British B-MISC
    lamb O
    . O

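To see what the ``B-`` and ``I-`` prefixes in this format encode, the following sketch groups tagged tokens back into entities. The helper ``bio_to_spans`` is a hypothetical name used for illustration only and is not part of DeepKE; it assumes the usual BIO convention, in which ``B-X`` opens an entity of type ``X`` and ``I-X`` continues it.

```python
def bio_to_spans(tokens, labels):
    """Group B-/I- tagged tokens into (entity_text, entity_type) pairs."""
    spans, current, current_type = [], [], None
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A new entity starts; flush the one in progress, if any.
            if current:
                spans.append((" ".join(current), current_type))
            current, current_type = [token], label[2:]
        elif label.startswith("I-") and current and label[2:] == current_type:
            # Continuation of the entity that is currently open.
            current.append(token)
        else:
            # "O" (or a stray I- tag) closes any open entity.
            if current:
                spans.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        spans.append((" ".join(current), current_type))
    return spans


tokens = ["EU", "rejects", "German", "call", "to", "boycott", "British", "lamb", "."]
labels = ["B-ORG", "O", "B-MISC", "O", "O", "O", "B-MISC", "O", "O"]
spans = bio_to_spans(tokens, labels)
print(spans)
```

For the sample sentence this yields three single-token entities: ``EU`` (ORG), ``German`` (MISC) and ``British`` (MISC).
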
**Step 3**

Train with CoNLL-2003:

``python run.py``

Train in the few-shot scenario:

``python run.py +train=few_shot``

To start from an existing model, set ``load_path`` in ``conf/train/few_shot.yaml`` to the path of that model.

**Step 4**

Predict:

Add ``- predict`` to ``conf/config.yaml``. In ``conf/predict.yaml``, set ``load_path`` to the model path and ``write_path`` to the path where the predicted results should be saved. Then run ``python predict.py``.

.. code-block:: bash

    cd example/ner/few-shot

    wget 120.27.214.45/Data/ner/few_shot/data.tar.gz

    tar -xzvf data.tar.gz

    python run.py

    python predict.py

Standard RE
-----------

The standard RE module is implemented with common deep learning models, including CNN, RNN, Capsule, GCN, Transformer and the pretrained model.

**Step 1**

Enter the ``DeepKE/example/re/standard`` folder.

**Step 2**

Get the data:

``wget 120.27.214.45/Data/re/standard/data.tar.gz``

``tar -xzvf data.tar.gz``

The dataset can be customized in the ``data`` folder and the parameters in the ``conf`` folder.

The dataset must be provided as a ``CSV`` file.

The data file must contain the following columns:

+----------+----------+------+-------------+------+-------------+
| Sentence | Relation | Head | Head_offset | Tail | Tail_offset |
+----------+----------+------+-------------+------+-------------+

The relation file must contain the following columns:

+-----------+-----------+----------+-------+
| Head_type | Tail_type | relation | Index |
+-----------+-----------+----------+-------+

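As an illustration of the data file layout, the sketch below writes one row with Python's standard ``csv`` module. Everything specific in it is an assumption: the sentence and relation are invented, the column names are copied from the table above (the shipped files may use different capitalization), and an offset is taken to be the character index of the entity's first occurrence in the sentence. Compare with the downloaded ``data`` folder before relying on it.

```python
import csv
import io

# Column names as listed in the table above.
FIELDS = ["Sentence", "Relation", "Head", "Head_offset", "Tail", "Tail_offset"]


def make_row(sentence, relation, head, tail):
    """Build one data row, deriving both offsets from the sentence."""
    # Assumption: an offset is the character index of the first occurrence.
    head_offset = sentence.find(head)
    tail_offset = sentence.find(tail)
    if head_offset < 0 or tail_offset < 0:
        raise ValueError("head and tail must both occur in the sentence")
    return {
        "Sentence": sentence,
        "Relation": relation,
        "Head": head,
        "Head_offset": head_offset,
        "Tail": tail,
        "Tail_offset": tail_offset,
    }


# Hypothetical example row; not taken from the DeepKE dataset.
row = make_row("Alice works for Acme", "employer", "Alice", "Acme")

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
print(csv_text)
```

Deriving the offsets from the sentence, rather than typing them by hand, keeps them consistent with the entity strings.
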
**Step 3**

Train:

``python run.py``

**Step 4**

Predict:

``python predict.py``

.. code-block:: bash

    cd example/re/standard

    wget 120.27.214.45/Data/re/standard/data.tar.gz

    tar -xzvf data.tar.gz

    python run.py

    python predict.py

Few-shot RE
-----------

This module targets the low-resource scenario.

**Step 1**

Enter the ``DeepKE/example/re/few-shot`` folder.

**Step 2**

Get the data:

``wget 120.27.214.45/Data/re/few_shot/data.tar.gz``

``tar -xzvf data.tar.gz``

The dataset can be customized in the ``data`` folder and the parameters in the ``conf`` folder.

The dataset must be provided as ``TXT`` and ``JSON`` files.

Each record in the data file must follow this format:

.. code-block:: text

    {"token": ["the", "most", "common", "audits", "were", "about", "waste", "and", "recycling", "."], "h": {"name": "audits", "pos": [3, 4]}, "t": {"name": "waste", "pos": [6, 7]}, "relation": "Message-Topic(e1,e2)"}

The relation file must follow this format:

.. code-block:: text

    {"Other": 0 , "Message-Topic(e1,e2)": 1 ... }

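A custom record can be sanity-checked against this layout before training. The sketch below is illustrative only: ``check_record`` is a hypothetical helper, and it assumes that ``pos`` is a half-open ``[start, end)`` token span, which is consistent with the sample record but is not stated explicitly here.

```python
import json


def check_record(record, relation_to_id):
    """Return True if the head/tail spans match the tokens and the relation is known."""
    tokens = record["token"]
    for key in ("h", "t"):
        start, end = record[key]["pos"]
        # Assumption: pos is a half-open [start, end) span over the token list.
        if " ".join(tokens[start:end]) != record[key]["name"]:
            return False
    return record["relation"] in relation_to_id


# The sample record from the format description above.
line = (
    '{"token": ["the", "most", "common", "audits", "were", "about", "waste", '
    '"and", "recycling", "."], "h": {"name": "audits", "pos": [3, 4]}, '
    '"t": {"name": "waste", "pos": [6, 7]}, "relation": "Message-Topic(e1,e2)"}'
)
record = json.loads(line)

# Only the two relations shown above; the real file lists more.
relation_to_id = {"Other": 0, "Message-Topic(e1,e2)": 1}

is_valid = check_record(record, relation_to_id)
print(is_valid)
```
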
**Step 3**

Train:

``python run.py``

To resume from a previously trained model, set ``train_from_saved_model`` in ``conf/train.yaml`` to the path where that model was saved. The directory for the logs generated during training can be customized with ``log_dir``.

**Step 4**

Predict:

``python predict.py``

.. code-block:: bash

    cd example/re/few-shot

    wget 120.27.214.45/Data/re/few_shot/data.tar.gz

    tar -xzvf data.tar.gz

    python run.py

    python predict.py

Document RE
-----------

This module targets document-level relation extraction.

**Step 1**

Enter the ``DeepKE/example/re/document`` folder.

**Step 2**

Get the data:

``wget 120.27.214.45/Data/re/document/data.tar.gz``

``tar -xzvf data.tar.gz``

The dataset can be customized in the ``data`` folder and the parameters in the ``conf`` folder.

The dataset must be provided as a ``JSON`` file.

The data file must follow this format:

.. code-block:: text

    [{"vertexSet": [[{"name": "Lark Force", "pos": [0, 2], "sent_id": 0, "type": "ORG"},...]],
      "labels": [{"r": "P607", "h": 1, "t": 3, "evidence": [0]}, ...],
      "title": "Lark Force",
      "sents": [["Lark", "Force", "was", "an", "Australian", "Army", "formation", "established", "in", "March", "1941", "during", "World", "War", "II", "for", "service", "in", "New", "Britain", "and", "New", "Ireland", "."],...}]

The relation file must follow this format:

.. code-block:: text

    {"P1376": 79,"P607": 27,...}

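The fields of a document record refer to each other by index: ``h`` and ``t`` in ``labels`` index into ``vertexSet``, and each mention's ``sent_id`` and ``pos`` index into ``sents``. The sketch below shows one way to follow those references. The document in it is a cut-down, hypothetical record: the second entity, its ``MISC`` type, and the label indices ``0`` and ``1`` are invented for illustration because the sample above is elided. The helper names are hypothetical as well, and ``pos`` is assumed to be a half-open token span.

```python
def mention_text(doc, mention):
    """Recover a mention's surface text from its sentence and token span."""
    # Assumption: pos is a half-open [start, end) span within sentence sent_id.
    start, end = mention["pos"]
    return " ".join(doc["sents"][mention["sent_id"]][start:end])


def resolve_labels(doc):
    """Map each label's head/tail indices to the first mention name of that entity."""
    triples = []
    for label in doc["labels"]:
        head = doc["vertexSet"][label["h"]][0]["name"]
        tail = doc["vertexSet"][label["t"]][0]["name"]
        triples.append((head, label["r"], tail))
    return triples


# Hypothetical, shortened record in the layout shown above.
doc = {
    "vertexSet": [
        [{"name": "Lark Force", "pos": [0, 2], "sent_id": 0, "type": "ORG"}],
        [{"name": "World War II", "pos": [12, 15], "sent_id": 0, "type": "MISC"}],
    ],
    "labels": [{"r": "P607", "h": 0, "t": 1, "evidence": [0]}],
    "title": "Lark Force",
    "sents": [[
        "Lark", "Force", "was", "an", "Australian", "Army", "formation",
        "established", "in", "March", "1941", "during", "World", "War", "II",
        "for", "service", "in", "New", "Britain", "and", "New", "Ireland", ".",
    ]],
}

triples = resolve_labels(doc)
print(triples)
print(mention_text(doc, doc["vertexSet"][1][0]))
```
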
**Step 3**

Train:

``python run.py``

To resume from a previously trained model, set ``train_from_saved_model`` in ``conf/train.yaml`` to the path where that model was saved. The directory for the logs generated during training can be customized with ``log_dir``.

**Step 4**

Predict:

``python predict.py``

.. code-block:: bash

    cd example/re/document

    wget 120.27.214.45/Data/re/document/data.tar.gz

    tar -xzvf data.tar.gz

    python run.py

    python predict.py

Standard AE
-----------

The standard AE module is implemented with common deep learning models, including CNN, RNN, Capsule, GCN, Transformer and the pretrained model.

**Step 1**

Enter the ``DeepKE/example/ae/standard`` folder.

**Step 2**

Get the data:

``wget 120.27.214.45/Data/ae/standard/data.tar.gz``

``tar -xzvf data.tar.gz``

The dataset can be customized in the ``data`` folder and the parameters in the ``conf`` folder.

The dataset must be provided as a ``CSV`` file.

The data file must contain the following columns:

+----------+-----------+--------+---------------+-----------------+------------------------+
| Sentence | Attribute | Entity | Entity_offset | Attribute_value | Attribute_value_offset |
+----------+-----------+--------+---------------+-----------------+------------------------+

The attribute file must contain the following columns:

+-----------+-------+
| Attribute | Index |
+-----------+-------+

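The attribute file maps each attribute name to an integer index. As a rough sketch of how such a file can be consumed, the snippet below loads it into a dictionary and inverts it so that a predicted index can be turned back into a name. The attribute names are invented, the column names are copied from the table above (the shipped file may capitalize them differently), and ``load_attribute_index`` is a hypothetical helper rather than a DeepKE function.

```python
import csv
import io

# Hypothetical attribute file contents; real attribute names will differ.
attribute_csv = "Attribute,Index\nNone,0\nbirthplace,1\n"


def load_attribute_index(text):
    """Parse the attribute CSV into a name -> index dictionary."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["Attribute"]: int(row["Index"]) for row in reader}


attribute_to_index = load_attribute_index(attribute_csv)
# Inverse mapping, useful for decoding a predicted index into a name.
index_to_attribute = {index: name for name, index in attribute_to_index.items()}

print(attribute_to_index)
print(index_to_attribute[1])
```
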
**Step 3**

Train:

``python run.py``

**Step 4**

Predict:

``python predict.py``

.. code-block:: bash

    cd example/ae/standard

    wget 120.27.214.45/Data/ae/standard/data.tar.gz

    tar -xzvf data.tar.gz

    python run.py

    python predict.py

For more details, refer to the video at https://www.bilibili.com/video/BV1n44y1x7iW?spm_id_from=333.999.0.0.