Merge branch 'fix' into 'master'

change the image size in transformer_tts readme

See merge request !43
liuyibing01 2020-03-10 17:00:33 +08:00
commit ed1998da13
1 changed file with 6 additions and 1 deletion

@@ -10,8 +10,13 @@ wget https://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2
tar xjvf LJSpeech-1.1.tar.bz2
```
## Model Architecture
+<div align="center" name="TransformerTTS model architecture">
+<img src="./images/model_architecture.jpg" width=400 height=600 /> <br>
+</div>
+<div align="center" >
+TransformerTTS model architecture
+</div>
-![TransformerTTS model architecture](./images/model_architecture.jpg)
The model uses multi-head attention in place of both the RNN structures and the original attention mechanism of [Tacotron2](https://arxiv.org/abs/1712.05884). It consists of two main parts, an encoder and a decoder. We also implement the CBHG model of Tacotron as the vocoder and convert the spectrogram into a raw waveform using the Griffin-Lim algorithm.
## Project Structure