update README

chenfeiyu 2020-02-13 08:49:39 +00:00 committed by liuyibing01
parent f1801569f2
commit 7837de9387
1 changed files with 7 additions and 5 deletions

@@ -28,7 +28,7 @@ You can choose to install via pypi or clone the repository and install manually.
pip install -e .
```
### cmudict
### Download cmudict for nltk
You also need to download cmudict for nltk, because we convert text into phonemes with `cmudict`.
```python
@@ -37,7 +37,7 @@ nltk.download("punkt")
nltk.download("cmudict")
```
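As a rough illustration of the text-to-phoneme step, the sketch below looks words up in a pronouncing dictionary shaped like the one returned by `nltk.corpus.cmudict.dict()` (a lowercase word mapped to a list of ARPAbet pronunciations). The helper name `words_to_phonemes` and the small inline `SAMPLE_DICT` are assumptions made for illustration; this is not the repository's actual frontend.

```python
# Hypothetical helper for illustration; not part of this repository.
# SAMPLE_DICT mimics the shape of nltk.corpus.cmudict.dict():
# lowercase word -> list of pronunciations, each a list of ARPAbet symbols.
SAMPLE_DICT = {
    "hello": [["HH", "AH0", "L", "OW1"], ["HH", "EH0", "L", "OW1"]],
    "world": [["W", "ER1", "L", "D"]],
}


def words_to_phonemes(words, pron_dict):
    """Replace each known word with its first listed pronunciation.

    Words missing from the dictionary are kept unchanged.
    """
    output = []
    for word in words:
        pronunciations = pron_dict.get(word.lower())
        if pronunciations:
            output.extend(pronunciations[0])
        else:
            output.append(word)
    return output


print(words_to_phonemes(["Hello", "world", "paddle"], SAMPLE_DICT))
```

Once the corpora are downloaded, `nltk.corpus.cmudict.dict()` could be passed in place of `SAMPLE_DICT`.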
## dataset
## Dataset
We experiment with the LJSpeech dataset. Download and unzip [LJSpeech](https://keithito.com/LJ-Speech-Dataset/).
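The extracted LJSpeech archive contains a `metadata.csv` file whose lines are pipe-separated: clip ID, raw transcription, normalized transcription. A minimal sketch of reading that format follows; the function name `read_metadata` and the sample rows are made up for illustration and are not taken from the dataset or from `data.py`.

```python
import io


def read_metadata(lines):
    """Yield (clip_id, normalized_text) pairs from pipe-separated lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the first two pipes only, so the last field is kept whole.
        clip_id, _raw_text, normalized_text = line.split("|", 2)
        yield clip_id, normalized_text


# Made-up rows in the same three-field layout (not real dataset entries).
sample = io.StringIO(
    "EX-0001|It cost $5.|It cost five dollars.\n"
    "EX-0002|Hello there.|Hello there.\n"
)
for clip_id, text in read_metadata(sample):
    print(clip_id, text)
```

For the real file, an open file handle such as `open("LJSpeech-1.1/metadata.csv", encoding="utf-8")` could be passed instead of `sample`.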
@@ -48,20 +48,22 @@ tar xjvf LJSpeech-1.1.tar.bz2
## Model Architecture
![DeepVoice3 model architecture](./_images/model_architecture.png)
![DeepVoice3 model architecture](./images/model_architecture.png)
The model consists of an encoder, a decoder, and a converter (and a speaker embedding for multispeaker models). The encoder, together with the decoder, forms the seq2seq part of the model, and the converter forms the postnet part.
## Project Structure
```text
├── data.py data processing
├── ljspeech.yaml (example) configuration file
├── sentences.txt sample sentences
├── synthesis.py script to synthesize waveform from text
├── train.py script to train a model
└── utils.py utility functions
```
## train
## Train
Train the model using `train.py`, following the usage displayed by `python train.py --help`.
@@ -100,7 +102,7 @@ optional arguments:
5. `--device` is the device (gpu id) to use for training. `-1` means CPU.
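The `-1` convention for `--device` can be sketched with `argparse` as below. This is only an assumed illustration: the default value, the helper `describe_device`, and the returned labels are hypothetical and are not taken from `train.py`.

```python
import argparse


def build_parser():
    """Build a parser with a --device flag (the default of -1 is an assumption)."""
    parser = argparse.ArgumentParser(description="device flag sketch")
    parser.add_argument(
        "--device",
        type=int,
        default=-1,
        help="gpu id to use for training; -1 means CPU",
    )
    return parser


def describe_device(device_id):
    """Map a device id to a readable label: negative ids select the CPU."""
    if device_id < 0:
        return "cpu"
    return f"gpu:{device_id}"


args = build_parser().parse_args(["--device", "0"])
print(describe_device(args.device))
```

A real training script would turn this id into a framework-specific device object rather than a string label.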
## synthesis
## Synthesis
```text
usage: synthesis.py [-h] [-c CONFIG] [-g DEVICE] checkpoint text output_path