deepke/example/re/document
xxupiano ea6321b97f Update README 2022-01-11 17:01:47 +08:00

Easy Start

English | Simplified Chinese

Requirements

python == 3.8

  • torch == 1.5.0
  • transformers == 3.4.0
  • opt-einsum == 3.3.0
  • ujson
  • deepke

Download Code

git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/re/document

Install with Pip

  • Create and activate a Python virtual environment.
  • Install dependencies: pip install -r requirements.txt.
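After installing, it can help to confirm that the pinned versions from the Requirements list are the ones actually present in the environment. The following is a minimal sketch using only the standard library; the helper name `find_mismatches` is hypothetical and not part of DeepKE, and it assumes that a local build tag such as `+cu101` on the installed version should be ignored when comparing.

```python
from importlib import metadata

# Pins copied from the Requirements list in this README.
PINNED = {
    "torch": "1.5.0",
    "transformers": "3.4.0",
    "opt-einsum": "3.3.0",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed. A local build tag
    (the part after "+") is stripped before comparing; this is an assumption.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        base = found.split("+", 1)[0] if found is not None else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

Running the script inside the virtual environment prints one line per package that is missing or on a different version, and prints nothing when all three pins match.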

Train and Predict

  • Dataset

    • Download the dataset to this directory.

      wget 120.27.214.45/Data/re/document/data.tar.gz
      tar -xzvf data.tar.gz
      
    • The DocRED dataset is stored in data:

      • dev.json: Validation set

      • rel_info.json: Relation set

      • rel2id.json: Mapping from relation labels to IDs

      • test.json: Test set

      • train_annotated.json: Training set annotated manually

      • train_distant.json: Training set generated by distant supervision
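A quick way to confirm that the archive was extracted correctly is to check that each of the files listed above exists and parses as JSON. The sketch below assumes that every file holds a single top-level JSON array or object; the helper name `summarize_dataset` is hypothetical and not part of DeepKE.

```python
import json
from pathlib import Path

# File names taken from the dataset list in this README.
EXPECTED_FILES = [
    "dev.json",
    "rel_info.json",
    "rel2id.json",
    "test.json",
    "train_annotated.json",
    "train_distant.json",
]


def summarize_dataset(data_dir="data"):
    """Map each expected file name to its top-level entry count, or None if missing."""
    summary = {}
    for name in EXPECTED_FILES:
        path = Path(data_dir) / name
        if not path.is_file():
            summary[name] = None
            continue
        # Assumption: the file is one JSON array or object, so len() applies.
        with path.open(encoding="utf-8") as handle:
            summary[name] = len(json.load(handle))
    return summary


if __name__ == "__main__":
    for name, count in summarize_dataset().items():
        print(f"{name}: {'missing' if count is None else count}")
```

Note that this loads each file fully into memory, which may be slow for the distantly supervised training set.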

  • Training

    • Training parameters, model paths, and other configuration are in the conf folder; modify them as needed before training.

    • Training on DocRED

      python run.py
      
    • The trained model is stored in the current directory by default.

    • Resume training from the last-trained model

      Set train_from_saved_model in the .yaml file to the path of the last-trained model.

    • Training logs are stored in the current directory by default; change the path by modifying log_dir in the .yaml file.
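The two settings mentioned above, train_from_saved_model and log_dir, can be edited by hand, but a scripted edit is handy when launching several runs. The following is a minimal sketch that rewrites a key in YAML text using only the standard library. It assumes both keys appear as unindented single-line `key: value` entries, which this README does not confirm; the helper name `set_yaml_value` and the example paths are hypothetical.

```python
import re


def set_yaml_value(yaml_text, key, value):
    """Return yaml_text with the top-level `key: ...` line set to a quoted value.

    Only unindented single-line entries are handled, and any trailing comment
    on that line is dropped. Raises KeyError if the key is not found.
    """
    pattern = re.compile(rf"^({re.escape(key)}[ \t]*:).*$", re.MULTILINE)
    # A function replacement avoids backslash-escape handling in the new value.
    updated, count = pattern.subn(lambda m: f'{m.group(1)} "{value}"', yaml_text)
    if count == 0:
        raise KeyError(key)
    return updated


if __name__ == "__main__":
    # Illustrative config text; the real file in conf may look different.
    example = 'log_dir: "."\ntrain_from_saved_model: ""\n'
    example = set_yaml_value(example, "train_from_saved_model", "./checkpoints/last.pt")
    example = set_yaml_value(example, "log_dir", "./logs")
    print(example)
```

For anything beyond flat keys, a real YAML parser is the safer choice; the regular expression here is only meant to illustrate the two edits.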

  • Prediction

    python predict.py
    
    • After prediction, the generated result.json is stored in the current directory.
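To get a first look at the predictions, result.json can be loaded and summarized per relation. The sketch below assumes the file is a JSON list of prediction objects that each carry a relation label under the key `r`, as in the usual DocRED submission format. This README does not document the layout, so adjust `relation_key` if the file differs. The helper name `count_relations` is hypothetical.

```python
import json
from collections import Counter


def count_relations(result_path="result.json", relation_key="r"):
    """Count predictions per relation label in a JSON list of prediction dicts."""
    with open(result_path, encoding="utf-8") as handle:
        predictions = json.load(handle)
    # Assumption: every entry is a dict that contains `relation_key`.
    return Counter(item[relation_key] for item in predictions)


if __name__ == "__main__":
    for relation, count in count_relations().most_common(10):
        print(f"{relation}: {count}")
```

The relation labels can then be looked up in rel_info.json to see what each one means.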

Model

DocuNet