ParakeetRebeccaRosario/examples/fastspeech2/baker
TianYuan 3d39385d5e add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
conf add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
README.md add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
batch_fn.py add fastspeech2 example 2021-07-19 06:31:52 +00:00
compute_statistics.py add fastspeech2 example data preprocess 2021-07-21 03:48:01 +00:00
config.py add fastspeech2 example data preprocess 2021-07-21 03:48:01 +00:00
fastspeech2_updater.py add fastspeech2 example data preprocess 2021-07-21 03:48:01 +00:00
frontend.py add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
gen_duration_from_textgrid.py add fastspeech2 example data preprocess 2021-07-21 03:48:01 +00:00
get_feats.py add fastspeech2 example data preprocess 2021-07-21 03:48:01 +00:00
normalize.py add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
preprocess.py add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
preprocess.sh add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
run.sh add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
sentences.txt add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
simple.lexicon add fastspeech2 example data preprocess 2021-07-21 03:48:01 +00:00
synthesize.py add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
synthesize.sh add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
synthesize_e2e.py add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
synthesize_e2e.sh add fastspeech2 example inference 2021-07-22 11:09:58 +00:00
train.py add fastspeech2 example data preprocess 2021-07-21 03:48:01 +00:00

README.md

FastSpeech2 with BZNSYP


Dataset


Download and Extract the dataset.

Download BZNSYP from its official website.

Get the MFA result of BZNSYP and extract it.

We use MFA to get durations for FastSpeech2. You can download the alignment result from here, or train your own MFA model by referring to the use_mfa example in our repo.
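
To illustrate what "durations" means here, the following is a minimal sketch of turning aligned phone intervals (in seconds) into per-phone frame counts. It is not the code in gen_duration_from_textgrid.py. The function name, the interval tuple layout, and the default sample rate and hop length are all assumptions for illustration; check the config under conf for the values this example actually uses.

```python
def intervals_to_frame_durations(intervals, sample_rate=24000, hop_length=300):
    """Convert (phone, start_sec, end_sec) intervals to (phone, n_frames).

    sample_rate and hop_length are assumed values, not taken from the repo.
    Each interval's end boundary is rounded to a frame index and durations
    are taken as differences, so they always sum to the final frame index.
    """
    durations = []
    prev_frame = 0
    for phone, _start, end in intervals:
        end_frame = int(round(end * sample_rate / hop_length))
        durations.append((phone, end_frame - prev_frame))
        prev_frame = end_frame
    return durations


# Hypothetical alignment for a short utterance.
example = [("sil", 0.0, 0.1), ("n", 0.1, 0.175), ("i3", 0.175, 0.4)]
frame_durations = intervals_to_frame_durations(example)
```

Rounding the cumulative boundaries, rather than each interval's length separately, avoids rounding drift, so the total duration stays aligned with the number of frames in the acoustic features.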

Preprocess the dataset.

Assume the path to the dataset is ~/datasets/BZNSYP. Assume the path to the MFA result of BZNSYP is ./baker_alignment_tone. Run the command below to preprocess the dataset.

./preprocess.sh

Train the model


./run.sh

Synthesize


We use Parallel WaveGAN as the neural vocoder. synthesize.sh synthesizes waveforms for the utterances in metadata.jsonl; synthesize_e2e.sh synthesizes waveforms from a list of text sentences.

./synthesize.sh

or

./synthesize_e2e.sh

See the bash scripts for more details on the input parameters.
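
If you want to inspect metadata.jsonl before synthesis, the sketch below reads a JSON Lines file, assuming (as the .jsonl extension suggests) one JSON object per line. The helper name read_jsonl is hypothetical, and the keys stored in each record are not documented here, so print a record to see what it contains.

```python
import json


def read_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

For example, `read_jsonl("metadata.jsonl")[0]` would return the first record, given a path to a file produced by the preprocessing step.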

Pretrained Model