Update README

This commit is contained in:
xxupiano 2022-01-11 17:01:47 +08:00
parent efde9b0d92
commit ea6321b97f
14 changed files with 767 additions and 278 deletions

View File

@ -98,6 +98,7 @@ conda activate deepke
pip install deepke
```
**Step3** Enter the task directory
```bash
@ -377,6 +378,7 @@ This toolkit provides many `Jupyter Notebook` and `Google Colab` tutorials. User
[RE Colab](https://colab.research.google.com/drive/1RGUBbbOBHlWJ1NXQLtP_YEUktntHtROa?usp=sharing)
<br>
# Tips

View File

@ -25,6 +25,11 @@ DeepKE is a knowledge extraction toolkit that supports <b>low-resource and document-level</b> scenarios and can
<br>
# What's New
## January 2022
- Released the paper [DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population](https://arxiv.org/abs/2201.03335)
## December 2021
- Added a `dockerfile` to build the environment automatically
## November 2021

View File

@ -1,58 +1,73 @@
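## Quick Start
### Requirements
> python == 3.8
- torch == 1.5
- hydra-core == 1.0.6
- tensorboard == 2.4.1
- matplotlib == 3.4.1
- scikit-learn == 0.24.1
- transformers == 3.4.0
- jieba == 0.42.1
- deepke
### Clone the Code
```
git clone git@github.com:zjunlp/DeepKE.git
```
### Install with pip
First create a Python virtual environment, then activate it.
- Install dependencies: ```pip install -r requirements.txt```
### Train and Predict with the Data
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/ae/standard/data.tar.gz```
After extraction, the training data is in the `data/origin` folder. It consists of four files:
- `train.csv`: training set
- `valid.csv`: validation set
- `test.csv`: test set
- `attribute.csv`: attribute types
- Start training: ```python run.py``` (all training parameters can be modified in the conf folder; when using an LM, modify 'lm_file' to use a locally downloaded model)
- Logs for each training run are saved in the `logs` folder, and model results are saved in the `checkpoints` folder.
- Run prediction: ```python predict.py```
## Models
1. CNN
2. RNN
3. Capsule
4. GCN
5. Transformer
6. Pre-trained model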
# Easy Start
<p align="left">
<b> English | <a href="https://github.com/zjunlp/DeepKE/blob/main/example/ae/standard/README_CN.md">简体中文</a> </b>
</p>
## Requirements
> python == 3.8
- torch == 1.5
- hydra-core == 1.0.6
- tensorboard == 2.4.1
- matplotlib == 3.4.1
- scikit-learn == 0.24.1
- transformers == 3.4.0
- jieba == 0.42.1
- deepke
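
The pinned versions above can be compared against the active environment before training. The following is a minimal sketch using only the standard library; the helper names `matches_pin` and `check_environment` are hypothetical and not part of DeepKE, and treating `1.5` as matching `1.5.0` is an assumption about how the pins are meant to be read.

```python
from importlib import metadata

# Pins copied from the requirements list above (distribution name -> version).
PINNED = {
    "torch": "1.5",
    "hydra-core": "1.0.6",
    "tensorboard": "2.4.1",
    "matplotlib": "3.4.1",
    "scikit-learn": "0.24.1",
    "transformers": "3.4.0",
    "jieba": "0.42.1",
}


def matches_pin(installed: str, pinned: str) -> bool:
    """Return True if `installed` equals `pinned` or extends it (1.5.0 matches 1.5)."""
    return installed == pinned or installed.startswith(pinned + ".")


def check_environment(pins: dict) -> dict:
    """Map each package name to 'ok', 'missing', or 'found <version>'."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Drop local build tags such as "+cu101" before comparing.
        base = installed.split("+")[0]
        report[name] = "ok" if matches_pin(base, pinned) else f"found {installed}"
    return report


if __name__ == "__main__":
    for name, status in check_environment(PINNED).items():
        print(f"{name:15s} {status}")
```

Running the script inside the virtual environment prints one line per package, which makes a mismatched `torch` or `transformers` version visible before `run.py` is started.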
## Download Code
```bash
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/ae/standard
```
## Install with Pip
- Create and activate a Python virtual environment.
- Install dependencies: `pip install -r requirements.txt`.
## Train and Predict
- Dataset
- Download the dataset to this directory.
```bash
wget 120.27.214.45/Data/ae/standard/data.tar.gz
tar -xzvf data.tar.gz
```
- The dataset is stored in `data/origin`:
- `train.csv`: Training set
- `valid.csv`: Validation set
- `test.csv`: Test set
- `attribute.csv`: Attribute types
- Training
- Parameters for training are in the `conf` folder and users can modify them before training.
- If using a pre-trained language model (LM), set `lm_file` to the path of the local model.
- Logs for training are in the `log` folder and the trained model is saved in the `checkpoints` folder.
```bash
python run.py
```
- Prediction
```bash
python predict.py
```
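
Before training, it can help to confirm that the archive unpacked into the layout described above. The sketch below mirrors `tar -xzvf data.tar.gz` with the standard `tarfile` module and then reports which of the four listed files are absent. The function names are hypothetical, and the assumption that the archive unpacks to `data/origin` relative to the current directory comes from the list above, not from inspecting the archive.

```python
import tarfile
from pathlib import Path

# File names taken from the dataset list above.
EXPECTED_FILES = ["train.csv", "valid.csv", "test.csv", "attribute.csv"]


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.gz archive into dest. Only use this on trusted archives."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)


def missing_files(data_dir: Path, expected=EXPECTED_FILES) -> list:
    """Return the expected file names that are absent from data_dir."""
    return [name for name in expected if not (data_dir / name).is_file()]


if __name__ == "__main__":
    archive = Path("data.tar.gz")
    if archive.exists():
        extract_archive(archive, Path("."))
    absent = missing_files(Path("data/origin"))
    print("missing:", absent if absent else "none")
```

An empty result from `missing_files` means all four files are in place; any listed name points to an incomplete download or a different directory layout.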
## Models
1. CNN
2. RNN
3. Capsule
4. GCN
5. Transformer
6. Pre-trained Model (BERT)

View File

@ -0,0 +1,63 @@
## Quick Start
<p align="left">
<b> <a href="https://github.com/zjunlp/DeepKE/blob/main/example/ae/standard/README.md">English</a> | 简体中文 </b>
</p>
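### Requirements
> python == 3.8
- torch == 1.5
- hydra-core == 1.0.6
- tensorboard == 2.4.1
- matplotlib == 3.4.1
- scikit-learn == 0.24.1
- transformers == 3.4.0
- jieba == 0.42.1
- deepke
### Clone the Code
```bash
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/ae/standard
```
### Install with pip
First create a Python virtual environment, then activate it.
- Install dependencies: ```pip install -r requirements.txt```
### Train and Predict with the Data
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/ae/standard/data.tar.gz```
After extraction, the training data is in the `data/origin` folder:
- `train.csv`: training set
- `valid.csv`: validation set
- `test.csv`: test set
- `attribute.csv`: attribute types
- Start training: ```python run.py``` (all training parameters can be modified in the conf folder; when using an LM, modify 'lm_file' to use a locally downloaded model)
- Logs for each training run are saved in the `logs` folder, and model results are saved in the `checkpoints` folder.
- Run prediction: ```python predict.py```
## Models
1. CNN
2. RNN
3. Capsule
4. GCN
5. Transformer
6. Pre-trained model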

View File

@ -1,56 +1,83 @@
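## Quick Start
### Requirements
> python == 3.8
- torch == 1.5
- transformers == 3.4.0
- deepke
### Clone the Code
```
git clone git@github.com:zjunlp/DeepKE.git
```
### Install with pip
First create a Python virtual environment, then activate it.
- Install dependencies: ```pip install -r requirements.txt```
### Train and Predict with the Data
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/ner/few_shot/data.tar.gz```
The training data is stored in the `data` folder, which includes the conll2003, mit-movie, mit-restaurant and atis datasets.
- conll2003 contains the following data:
- `train.txt`: training set
- `dev.txt`: validation set
- `test.txt`: test set
- `indomain-train.txt`: in-domain dataset
- mit-movie, mit-restaurant and atis contain the following data:
- `k-shot-train.txt`: training sets for k=[10, 20, 50, 100, 200, 500]
- `test.txt`: test set
- Start training: the model load and save paths and the configuration can be modified in the conf folder
- Train on conll2003: ` python run.py ` (all training parameters can be modified in the conf folder)
- Run few-shot training: ` python run.py +train=few_shot ` (to load a model, modify load_path in few_shot.yaml)
- Logs for each training run are saved in the `logs` folder; the directory for model results can be customized.
- Run prediction: add - predict to config.yaml, then in predict.yaml set load_path to the model path and write_path to the path where predictions are saved, then run ` python predict.py `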
# Easy Start
<p align="left">
<b> English | <a href="https://github.com/zjunlp/DeepKE/blob/main/example/ner/few-shot/README_CN.md">简体中文</a> </b>
</p>
## Requirements
> python == 3.8
- torch == 1.5
- transformers == 3.4.0
- deepke
## Download Code
```bash
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/ner/few-shot
```
## Install with Pip
- Create and activate a Python virtual environment.
- Install dependencies: `pip install -r requirements.txt`.
## Train and Predict
- Dataset
- Download the dataset to this directory.
```bash
wget 120.27.214.45/Data/ner/few-shot/data.tar.gz
tar -xzvf data.tar.gz
```
- The datasets are stored in `data`, including CoNLL-2003, MIT-movie, MIT-restaurant and ATIS.
- **CoNLL-2003**
- `train.txt`: Training set
- `valid.txt`: Validation set
- `test.txt`: Test set
- `indomain-train.txt`: In-domain training set
- **MIT-movie, MIT-restaurant and ATIS**
- `k-shot-train.txt`: Training sets for k=[10, 20, 50, 100, 200, 500]
- `test.txt`: Test set
- Training
- Parameters, model paths and configuration for training are in the `conf` folder and users can modify them before training.
- Training on CoNLL-2003
```bash
python run.py
```
- Few-shot Training
If a model needs to be loaded, modify `load_path` in `few_shot.yaml`.
```bash
python run.py +train=few_shot
```
- Logs for training are in the `log` folder. The path of the trained model can be customized.
- Prediction
- Add `- predict` in `config.yaml`
- Modify `load_path` as the path of the trained model and `write_path` as the path of predicted results in `predict.yaml`
- ```bash
python predict.py
```
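
The few-shot datasets above provide one training file per shot count. The sketch below enumerates those files so a missing split is noticed before training. It assumes that the literal `k` in `k-shot-train.txt` is replaced by the number (for example `10-shot-train.txt`); that naming, the helper name `k_shot_train_path`, and the directory `data/mit-movie` are illustrative assumptions and should be checked against the extracted archive.

```python
from pathlib import Path

# Shot counts listed in the dataset description above.
K_VALUES = [10, 20, 50, 100, 200, 500]


def k_shot_train_path(dataset_dir: Path, k: int) -> Path:
    """Build the training-file path for a given k.

    Assumption: the file is named "<k>-shot-train.txt", e.g. "10-shot-train.txt".
    """
    if k not in K_VALUES:
        raise ValueError(f"k must be one of {K_VALUES}, got {k}")
    return dataset_dir / f"{k}-shot-train.txt"


if __name__ == "__main__":
    # "data/mit-movie" is a hypothetical directory name used for illustration.
    for k in K_VALUES:
        path = k_shot_train_path(Path("data/mit-movie"), k)
        print(path, "exists" if path.exists() else "not found")
```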
## Model
[LightNER](https://arxiv.org/abs/2109.00720)

View File

@ -0,0 +1,64 @@
## Quick Start
<p align="left">
<b> <a href="https://github.com/zjunlp/DeepKE/blob/main/example/ner/few-shot/README.md">English</a> | 简体中文 </b>
</p>
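### Requirements
> python == 3.8
- torch == 1.5
- transformers == 3.4.0
- deepke
### Clone the Code
```bash
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/ner/few-shot
```
### Install with pip
First create a Python virtual environment, then activate it.
- Install dependencies: ```pip install -r requirements.txt```
### Train and Predict with the Data
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/ner/few_shot/data.tar.gz```
The training data is stored in the `data` folder, which includes the CoNLL2003, MIT-movie, MIT-restaurant and ATIS datasets.
- conll2003 contains the following data:
- `train.txt`: training set
- `dev.txt`: validation set
- `test.txt`: test set
- `indomain-train.txt`: in-domain dataset
- MIT-movie, MIT-restaurant and ATIS contain the following data:
- `k-shot-train.txt`: training sets for k=[10, 20, 50, 100, 200, 500]
- `test.txt`: test set
- Start training: the model load and save paths and the configuration can be modified in the conf folder
- Train on conll2003: ` python run.py ` (all training parameters can be modified in the conf folder)
- Run few-shot training: ` python run.py +train=few_shot ` (to load a model, modify load_path in few_shot.yaml)
- Logs for each training run are saved in the `logs` folder; the directory for model results can be customized.
- Run prediction: add - predict to config.yaml, then in predict.yaml set load_path to the model path and write_path to the path where predictions are saved, then run ` python predict.py `
### Model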
[LightNER](https://arxiv.org/abs/2109.00720)

View File

@ -1,52 +1,65 @@
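## Quick Start
### Requirements
> python == 3.8
- pytorch-transformers == 1.2.0
- torch == 1.5.0
- hydra-core == 1.0.6
- seqeval == 1.2.2
- tqdm == 4.60.0
- matplotlib == 3.4.1
- deepke
### Clone the Code
```
git clone git@github.com:zjunlp/DeepKE.git
```
### Install with pip
First create a Python virtual environment, then activate it.
- Install dependencies: `pip install -r requirements.txt`
### Train and Predict with the Data
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/ner/standard/data.tar.gz```
The data is stored in the `data` folder. It consists of three files:
- `train.txt`: training set
- `valid.txt`: validation set
- `test.txt`: test set
- Start training: ```python run.py``` (all training parameters can be modified in the conf folder)
- Logs for each training run are saved in the `logs` folder, and model results are saved in the `checkpoints` folder.
- Run prediction: ```python predict.py```
### Model
BERT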
# Easy Start
<p align="left">
<b> English | <a href="https://github.com/zjunlp/DeepKE/blob/main/example/ner/standard/README_CN.md">简体中文</a> </b>
</p>
## Requirements
> python == 3.8
- pytorch-transformers == 1.2.0
- torch == 1.5.0
- hydra-core == 1.0.6
- seqeval == 1.2.2
- tqdm == 4.60.0
- matplotlib == 3.4.1
- deepke
## Download Code
```bash
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/ner/standard
```
## Install with Pip
- Create and activate a Python virtual environment.
- Install dependencies: `pip install -r requirements.txt`.
## Train and Predict
- Dataset
- Download the dataset to this directory.
```bash
wget 120.27.214.45/Data/ner/standard/data.tar.gz
tar -xzvf data.tar.gz
```
- The dataset is stored in `data`
- `train.txt`: Training set
- `valid.txt`: Validation set
- `test.txt`: Test set
- Training
- Parameters for training are in the `conf` folder and users can modify them before training.
- Logs for training are in the `log` folder and the trained model is saved in the `checkpoints` folder.
```bash
python run.py
```
- Prediction
```bash
python predict.py
```
## Model
BERT

View File

@ -0,0 +1,57 @@
## Quick Start
<p align="left">
<b> <a href="https://github.com/zjunlp/DeepKE/blob/main/example/ner/standard/README.md">English</a> | 简体中文 </b>
</p>
### Requirements
> python == 3.8
- pytorch-transformers == 1.2.0
- torch == 1.5.0
- hydra-core == 1.0.6
- seqeval == 1.2.2
- tqdm == 4.60.0
- matplotlib == 3.4.1
- deepke
### Clone the Code
```
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/ner/standard
```
### Install with pip
First create a Python virtual environment, then activate it.
- Install dependencies: `pip install -r requirements.txt`
### Train and Predict with the Data
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/ner/standard/data.tar.gz```
The data is stored in the `data` folder:
- `train.txt`: training set
- `valid.txt`: validation set
- `test.txt`: test set
- Start training: ```python run.py``` (all training parameters can be modified in the conf folder)
- Logs for each training run are saved in the `logs` folder, and model results are saved in the `checkpoints` folder.
- Run prediction: ```python predict.py```
### Model
BERT

View File

@ -1,6 +1,10 @@
## Quick Start
# Easy Start
### Requirements
<p align="left">
<b> English | <a href="https://github.com/zjunlp/DeepKE/blob/main/example/re/document/README_CN.md">简体中文</a> </b>
</p>
## Requirements
> python == 3.8
@ -10,51 +14,68 @@
- ujson
- deepke
### Clone the Code
## Download Code
```bash
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/re/document
```
git clone git@github.com:zjunlp/DeepKE.git
```
### Install with pip
First create a Python virtual environment, then activate it.
## Install with Pip
- Install dependencies: ```pip install -r requirements.txt```
- Create and activate a Python virtual environment.
- Install dependencies: `pip install -r requirements.txt`.
### Train and Predict with the Data
## Train and Predict
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/re/document/data.tar.gz```
- Dataset
The training data is stored in the `data` folder. The model uses the [DocRED](https://github.com/thunlp/DocRED/tree/master/) dataset.
- Download the dataset to this directory.
```bash
wget 120.27.214.45/Data/re/document/data.tar.gz
tar -xzvf data.tar.gz
```
- DocRED contains the following data:
- The dataset [DocRED](https://github.com/thunlp/DocRED/tree/master/) is stored in `data`:
- `dev.json`: validation set
- `dev.json`: Validation set
- `rel_info.json`: Relation set
- `rel_info.json`: relation set
- `rel2id.json`: Mapping from relation labels to IDs
- `rel2id.json`: mapping from relation labels to IDs
- `test.json`: Test set
- `test.json`: test set
- `train_annotated.json`: Training set annotated manually
- `train_annotated.json`: training set
- `train_distant.json`: Training set generated by distant supervision
- `train_distant.json`
- Training
- Start training: the model load and save paths and the configuration can be modified in the `.yaml` files under conf
- Train on the DocRED dataset: `python run.py`
- Parameters, model paths and configuration for training are in the `conf` folder and users can modify them before training.
- The trained model is saved in the root directory
- Training on DocRED
- Resume training from the last-trained model: set train_from_saved_model in `.yaml` to the path of the last saved model
```bash
python run.py
```
- Training logs are saved in the root directory by default; the path can be configured via log_dir in `.yaml`
- The trained model is stored in the current directory by default.
- Run prediction: `python predict.py`
- Resume training from the last-trained model<br>
- The `result.json` generated by prediction is saved in the root directory
set `train_from_saved_model` in `.yaml` to the path of the last-trained model
- Logs for training are stored in the current directory by default; the path can be changed via `log_dir` in `.yaml`
## Models
DocuNet
- Prediction
```bash
python predict.py
```
- After prediction, generated `result.json` is stored in the current directory
## Model
[DocuNet](https://arxiv.org/abs/2106.03618)

View File

@ -0,0 +1,65 @@
## Quick Start
<p align="left">
<b> <a href="https://github.com/zjunlp/DeepKE/blob/main/example/re/document/README.md">English</a> | 简体中文 </b>
</p>
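### Requirements
> python == 3.8
- torch == 1.5.0
- transformers == 3.4.0
- opt-einsum == 3.3.0
- ujson
- deepke
### Clone the Code
```
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/re/d
```
### Install with pip
First create a Python virtual environment, then activate it.
- Install dependencies: ```pip install -r requirements.txt```
### Train and Predict with the Data
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/re/document/data.tar.gz```
The training data is stored in the `data` folder. The model uses the [DocRED](https://github.com/thunlp/DocRED/tree/master/) dataset.
- DocRED contains the following data:
- `dev.json`: validation set
- `rel_info.json`: relation set
- `rel2id.json`: mapping from relation labels to IDs
- `test.json`: test set
- `train_annotated.json`: manually annotated training set
- `train_distant.json`: training set generated by distant supervision
- Start training: the model load and save paths and the configuration can be modified in the `.yaml` files under conf
- Train on the DocRED dataset: `python run.py`
- The trained model is saved in the current directory
- Resume training from the last-trained model: set train_from_saved_model in `.yaml` to the path of the last saved model
- Training logs are saved in the root directory by default; the path can be configured via log_dir in `.yaml`
- Run prediction: `python predict.py`
- The `result.json` generated by prediction is saved in the root directory
## Models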
[DocuNet](https://arxiv.org/abs/2106.03618)
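
The `rel2id.json` file listed above maps relation labels to IDs, and predictions are often easier to read once IDs are mapped back to labels. The following sketch loads the file and builds the reverse lookup. It assumes the file is a flat JSON object of label strings to integer IDs and that it lives at `data/rel2id.json`; both are assumptions, and the function names are hypothetical rather than part of DeepKE.

```python
import json
from pathlib import Path


def load_rel2id(path: Path) -> dict:
    """Load the relation-label -> ID mapping from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def invert_mapping(rel2id: dict) -> dict:
    """Build the ID -> label mapping, rejecting duplicate IDs."""
    id2rel = {}
    for label, idx in rel2id.items():
        if idx in id2rel:
            raise ValueError(f"ID {idx} is used by both {id2rel[idx]!r} and {label!r}")
        id2rel[idx] = label
    return id2rel


if __name__ == "__main__":
    path = Path("data/rel2id.json")  # assumed location
    if path.exists():
        id2rel = invert_mapping(load_rel2id(path))
        print(len(id2rel), "relations loaded")
```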

View File

@ -1,6 +1,10 @@
## Quick Start
# Easy Start
### Requirements
<p align="left">
<b> English | <a href="https://github.com/zjunlp/DeepKE/blob/main/example/re/few-shot/README_CN.md">简体中文</a> </b>
</p>
## Requirements
> python == 3.8
@ -9,46 +13,63 @@
- hydra-core == 1.0.6
- deepke
### Clone the Code
## Download Code
```bash
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/re/few-shot
```
git clone git@github.com:zjunlp/DeepKE.git
```
### Install with pip
First create a Python virtual environment, then activate it.
## Install with Pip
- Install dependencies: ```pip install -r requirements.txt```
- Create and activate a Python virtual environment.
- Install dependencies: `pip install -r requirements.txt`.
### Train and Predict with the Data
## Train and Predict
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/re/few_shot/data.tar.gz```
- Dataset
The training data is stored in the `data` folder. The model uses the [SEMEVAL](https://semeval2.fbk.eu/semeval2.php?location=tasks#T11) dataset, which comes from Task 8, "Multi-Way Classification of Semantic Relations Between Pairs of Nominals", of the 2010 International Workshop on Semantic Evaluation.
- Download the dataset to this directory.
- SEMEVAL contains the following data:
```bash
wget 120.27.214.45/Data/re/few-shot/data.tar.gz
tar -xzvf data.tar.gz
```
- `rel2id.json`: mapping from relation labels to IDs
- The dataset [SEMEVAL](https://semeval2.fbk.eu/semeval2.php?location=tasks#T11) is stored in `data`:
- `rel2id.json`: Mapping from relation labels to IDs
- `temp.txt`: Results of processing the relation labels
- `temp.txt`: relation label processing
- `test.txt`: Test set
- `test.txt`: test set
- `train.txt`: Training set
- `train.txt`: training set
- `val.txt`: Validation set
- `val.txt`: validation set
- Training
- Start training: the model load and save paths and the configuration can be modified in the `.yaml` files under conf
- Run few-shot training on the SEMEVAL dataset: `python run.py`
- Parameters, model paths and configuration for training are in the `conf` folder and users can modify them before training.
- The trained model is saved in the root directory by default
- Few-shot training on SEMEVAL
- Resume training from the last-trained model: set train_from_saved_model in `.yaml` to the path of the last saved model
```bash
python run.py
```
- Training logs are saved in the root directory by default; the path can be configured via log_dir in `.yaml`
- The trained model is stored in the current directory by default.
- Run prediction: `python predict.py `
- Resume training from the last-trained model<br>
set `train_from_saved_model` in `.yaml` to the path of the last-trained model
## Models
KnowPrompt
- Logs for training are stored in the current directory by default; the path can be changed via `log_dir` in `.yaml`
- Prediction
```bash
python predict.py
```
## Model
[KnowPrompt](https://arxiv.org/abs/2104.07650)

View File

@ -0,0 +1,59 @@
## Quick Start
<p align="left">
<b> <a href="https://github.com/zjunlp/DeepKE/blob/main/example/re/few-shot/README.md">English</a> | 简体中文 </b>
</p>
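### Requirements
> python == 3.8
- torch == 1.5
- transformers == 3.4.0
- hydra-core == 1.0.6
- deepke
### Clone the Code
```
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/re/few-shot
```
### Install with pip
First create a Python virtual environment, then activate it.
- Install dependencies: ```pip install -r requirements.txt```
### Train and Predict with the Data
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/re/few_shot/data.tar.gz```
The training data is stored in the `data` folder. The model uses the [SEMEVAL](https://semeval2.fbk.eu/semeval2.php?location=tasks#T11) dataset, which comes from Task 8, "Multi-Way Classification of Semantic Relations Between Pairs of Nominals", of the 2010 International Workshop on Semantic Evaluation.
- SEMEVAL contains the following data:
- `rel2id.json`: mapping from relation labels to IDs
- `temp.txt`: relation label processing
- `test.txt`: test set
- `train.txt`: training set
- `val.txt`: validation set
- Start training: the model load and save paths and the configuration can be modified in the `.yaml` files under conf
- Run few-shot training on the SEMEVAL dataset: `python run.py`
- The trained model is saved in the current directory by default
- Resume training from the last-trained model: set train_from_saved_model in `.yaml` to the path of the last saved model
- Training logs are saved in the current directory by default; the path can be configured via log_dir in `.yaml`
- Run prediction: `python predict.py `
## Models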
[KnowPrompt](https://arxiv.org/abs/2104.07650)

View File

@ -1,58 +1,72 @@
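## Quick Start
### Requirements
> python == 3.8
- torch == 1.5
- hydra-core == 1.0.6
- tensorboard == 2.4.1
- matplotlib == 3.4.1
- scikit-learn == 0.24.1
- transformers == 3.4.0
- jieba == 0.42.1
- deepke
### Clone the Code
```
git clone git@github.com:zjunlp/DeepKE.git
```
### Install with pip
First create a Python virtual environment, then activate it.
- Install dependencies: ```pip install -r requirements.txt```
### Train and Predict with the Data
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/re/standard/data.tar.gz```
The training data is stored in the `data/origin` folder. It consists of four files:
- `train.csv`: training set
- `valid.csv`: validation set
- `test.csv`: test set
- `relation.csv`: relation types
- Start training: ```python run.py``` (all training parameters can be modified in the conf folder; when using an LM, modify 'lm_file' to use a locally downloaded model)
- Logs for each training run are saved in the `logs` folder, and model results are saved in the `checkpoints` folder.
- Run prediction: ```python predict.py```
## Models
1. CNN
2. RNN
3. Capsule
4. GCN
5. Transformer
6. Pre-trained model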
# Easy Start
<p align="left">
<b> English | <a href="https://github.com/zjunlp/DeepKE/blob/main/example/re/standard/README_CN.md">简体中文</a> </b>
</p>
## Requirements
> python == 3.8
- torch == 1.5
- hydra-core == 1.0.6
- tensorboard == 2.4.1
- matplotlib == 3.4.1
- scikit-learn == 0.24.1
- transformers == 3.4.0
- jieba == 0.42.1
- deepke
## Download Code
```bash
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/re/standard
```
## Install with Pip
- Create and activate a Python virtual environment.
- Install dependencies: `pip install -r requirements.txt`.
## Train and Predict
- Dataset
- Download the dataset to this directory.
```bash
wget 120.27.214.45/Data/re/standard/data.tar.gz
tar -xzvf data.tar.gz
```
- The dataset is stored in `data/origin`:
- `train.csv`: Training set
- `valid.csv`: Validation set
- `test.csv`: Test set
- `relation.csv`: Relation labels
- Training
- Parameters for training are in the `conf` folder and users can modify them before training.
- If using a pre-trained language model (LM), set `lm_file` to the path of the local model.
- Logs for training are in the `log` folder and the trained model is saved in the `checkpoints` folder.
```bash
python run.py
```
- Prediction
```bash
python predict.py
```
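
A quick look at the CSV files listed above can catch an empty or truncated download before a training run. The sketch below reports the column names and the number of data rows for each file. It assumes each file is UTF-8 encoded with a header row, which the text above does not state, and `summarize_csv` is a hypothetical helper rather than a DeepKE function.

```python
import csv
from pathlib import Path


def summarize_csv(path: Path) -> tuple:
    """Return (column names, number of data rows), assuming a header row."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)
    return header, row_count


if __name__ == "__main__":
    for name in ["train.csv", "valid.csv", "test.csv", "relation.csv"]:
        path = Path("data/origin") / name
        if path.exists():
            header, rows = summarize_csv(path)
            print(f"{name}: {rows} rows, columns={header}")
        else:
            print(f"{name}: not found")
```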
## Models
1. CNN
2. RNN
3. Capsule
4. GCN
5. Transformer
6. Pre-trained Model (BERT)

View File

@ -0,0 +1,63 @@
## Quick Start
<p align="left">
<b> <a href="https://github.com/zjunlp/DeepKE/blob/main/example/re/standard/README.md">English</a> | 简体中文 </b>
</p>
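### Requirements
> python == 3.8
- torch == 1.5
- hydra-core == 1.0.6
- tensorboard == 2.4.1
- matplotlib == 3.4.1
- scikit-learn == 0.24.1
- transformers == 3.4.0
- jieba == 0.42.1
- deepke
### Clone the Code
```
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/re/standard
```
### Install with pip
First create a Python virtual environment, then activate it.
- Install dependencies: ```pip install -r requirements.txt```
### Train and Predict with the Data
- Store the data: first download the data into this directory with ```wget 120.27.214.45/Data/re/standard/data.tar.gz```
The training data is stored in the `data/origin` folder:
- `train.csv`: training set
- `valid.csv`: validation set
- `test.csv`: test set
- `relation.csv`: relation types
- Start training: ```python run.py``` (all training parameters can be modified in the conf folder; when using an LM, modify 'lm_file' to use a locally downloaded model)
- Logs for each training run are saved in the `logs` folder, and model results are saved in the `checkpoints` folder.
- Run prediction: ```python predict.py```
## Models
1. CNN
2. RNN
3. Capsule
4. GCN
5. Transformer
6. Pre-trained model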