Commit Graph

81 Commits

Author SHA1 Message Date
TianYuan c497fd843d format 2021-08-17 09:54:07 +00:00
TianYuan 796fafbac8 fix pwg 2021-08-09 10:46:52 +00:00
TianYuan 2eb899b0b7 Merge branch 'develop' of https://github.com/PaddlePaddle/Parakeet into fastspeech2_test 2021-08-03 10:50:51 +00:00
chenfeiyu 133294340c add to_static export for speedyspeech and pwg, at the cost of making lots of compromises 2021-07-21 16:57:35 +08:00
TianYuan 6553d1d723 format docstrings 2021-07-13 07:55:56 +00:00
chenfeiyu 8b7dabbd8d an inference interface for speedyspeech and pwg 2021-07-12 17:33:00 +08:00
TianYuan 3af3c29a94 add fastspeech2 2021-07-12 06:19:26 +00:00
chenfeiyu 6c21d80025 add WIP: speedyspeech model and example with baker dataset. 2021-07-08 16:47:08 +08:00
chenfeiyu ef51e1ab13 refined training module 2021-06-30 13:08:23 +08:00
chenfeiyu a738954001 1. change default data layout to channel last in preprocessing;
2. add Summary and DictSummary for aggregation of evaluation losses;
3. add unittest for report and scope.
2021-06-18 09:44:32 +00:00
chenfeiyu 8dbcc9bccb add profiling 2021-06-16 09:40:47 +00:00
chenfeiyu 3c964fde54 add parallel wavegan model 2021-06-10 04:08:05 +08:00
chenfeiyu 759999c738 STFT and MelScale: register filters as buffer. 2021-06-10 04:06:06 +08:00
Feiyu Chan 4f288a6d4f add ge2e and tacotron2_aishell3 example (#107)
* hacky thing, add tone support for acoustic model

* fix experiments for waveflow and wavenet, only write visual log in rank-0

* use emb add in tacotron2

* 1. remove space from numericalized representation;
2. fix decoder padding mask's unsqueeze dim.

* remove bn in postnet

* refactoring code

* add an option to normalize volume when loading audio.

* add an embedding layer.

* 1. change the default min value of LogMagnitude to 1e-5;
2. remove stop logit prediction from tacotron2 model.

* WIP: baker

* add ge2e

* fix lstm speaker encoder

* fix lstm speaker encoder

* fix speaker encoder and add support for 2 more datasets

* simplify visualization code

* add a simple strategy to support multispeaker for tacotron.

* add vctk example for refactored tacotron

* fix indentation

* fix class name

* fix visualizer

* fix root path

* fix root path

* fix root path

* fix typos

* fix bugs

* fix text log extension name

* add example for baker and aishell3

* update experiment and display

* format code for tacotron_vctk, add plot_waveform to display

* add new trainer

* minor fix

* add global condition support for tacotron2

* add gst layer

* add 2 frontend

* fix fmax for example/waveflow

* update collate function, data loader now does not convert nested list into numpy array.

* WIP: add hifigan

* WIP:update hifigan

* change stft to use conv1d

* add audio datasets

* change batch_text_id, batch_spec, batch_wav to include valid lengths in the returned value

* change wavenet to use on-the-fly preprocessing

* fix typos

* resolve conflict

* remove imports that are removed

* remove files not included in this release

* remove imports to deleted modules

* move tacotron2_msp

* clean code

* fix argument order

* fix argument name

* clean code for data processing

* WIP: add README

* add more details to the README, fix some preprocess scripts

* add voice cloning notebook

* add an option to alter the loss and model structure of tacotron2, add an alternative config

* add plot_multiple_attentions and update visualization code in transformer_tts

* format code

* remove tacotron2_msp

* update tacotron2 from_pretrained, update setup.py

* update tacotron2

* update tacotron_aishell3's README

* add images for examples/tacotron2_aishell3's README

* update README for examples/ge2e

* add STFT back

* add extra_config keys into the default config of tacotron

* fix typos and docs

* update README and doc

* update docstrings for tacotron

* update doc

* update README

* add links to download pretrained models

* refine READMEs and clean code

* add praatio into requirements for running the experiments

* format code with pre-commit

* simplify text processing code and update notebook
2021-05-13 17:49:50 +08:00
Li Fuchen e88cbace1c Revert "bug fix: apply dropout to logits before softmax" 2021-01-07 15:19:49 +08:00
chenfeiyu 7b9e9c7a67 bug fix: apply dropout to logits before softmax 2020-12-31 16:52:21 +08:00
chenfeiyu 2421a936ed fix positional encoding naming conflict 2020-12-21 17:41:18 +08:00
iclementine e03e96d9e4 format all the code with yapf 2020-12-20 13:15:07 +08:00
Li Fuchen 544594ec54 Merge pull request #63 from iclementine/doc
update docstrings for models.wavenet.
2020-12-18 20:57:28 +08:00
iclementine 84ad4c9e65 1. update docstrings for models.wavenet;
2. remove unnecessary code;
3. fix typos
2020-12-18 20:55:27 +08:00
lfchener 1af9127ee6 add docstring for LocationSensitiveAttention 2020-12-18 17:31:51 +08:00
iclementine 310366bb54 1. fix format errors and typos 2020-12-18 16:09:38 +08:00
iclementine d78a8b4e1e 1. update documentation for paddle.modules;
2. update TransformerEncoder and TransformerDecoder's implementation (mask and dropout).
2020-12-18 15:31:13 +08:00
iclementine 49c9cb38be use numpydoc instead of napoleon 2020-12-18 11:12:22 +08:00
iclementine bbc50faef2 add generated api_doc 2020-12-18 10:54:50 +08:00
lfchener e30d7ad48f merge upstream develop 2020-12-10 03:37:56 +00:00
chenfeiyu a1b827460c fix typos, move quantize/dequantize to modules/audio 2020-12-09 21:05:39 +08:00
lfchener b12eda8423 add network of tacotron2 model 2020-12-09 09:08:17 +00:00
chenfeiyu 0287f46532 switch back to keras style sample weight 2020-12-05 21:08:10 +08:00
chenfeiyu a4a0bd8c98 add last bn for the decoder postnet, switch back to weighted mean 2020-12-05 14:00:08 +08:00
chenfeiyu c57e8e7350 fix transformer_tts' stop condition 2020-12-04 02:11:02 +08:00
chenfeiyu 810f979dba switch to keras style sample_weight in losses 2020-12-03 15:37:43 +08:00
chenfeiyu 6edc7d8474 switch back to standard implementation of positional encoding 2020-12-03 14:54:32 +08:00
chenfeiyu 404add2caa temporary fix for memory leak 2020-12-03 14:51:25 +08:00
chenfeiyu 0cdad602e2 fix a bug for changing reduction factor in transformer_tts 2020-11-03 11:18:46 +08:00
chenfeiyu 57d820f055 add support for channel last in batch_spec, and Conv1dBatchNorm 2020-10-30 15:13:57 +08:00
chenfeiyu c43216ae9b 1. API renaming Conv1d -> Conv1D, BatchNorm1d -> BatchNorm1D;
2. add losses in parakeet/modules;
3. fix a bug in phonetics;
4. TransformerTTS update: encoder dim can be different from decoder dim;
5. MultiHeadAttention in TransformerTTS: add k_input_dim & v_input_dim in __init__ to allow different feature sizes for k and v.
2020-10-22 05:04:45 +00:00
iclementine 53d0382fc7 clean code: remove deprecated modules 2020-10-15 23:07:30 +08:00
iclementine 5270774bb0 tested io for TransformerTTS 2020-10-15 22:48:09 +08:00
iclementine 40457227e6 move Conv1dBatchNorm to conv.py 2020-10-14 10:05:26 +08:00
iclementine f9087ea9a2 add masking functions 2020-10-13 15:53:18 +08:00
iclementine a8192c79cc WIP: refactor 2020-10-10 15:51:54 +08:00
chenfeiyu 45af3a43b2 fix WeightNormWrapper, stop using CacheDataset for deep voice 3, pin numba version to 0.47.0 2020-06-12 10:01:22 +00:00
Yibing Liu 9b8fd9f93d Upgrade waveflow to 1.8.0 2020-05-22 07:16:45 +00:00
chenfeiyu 6aac18278e refactor for deep voice 3, update wavenet and clarinet to use enable_dygraph 2020-05-20 12:37:19 +00:00
chenfeiyu ff1d66ea94 update for deepvoice3, fix weight norm 2020-05-06 08:36:43 +00:00
liuyibing01 be70b41fd1 Merge branch 'master' into 'master'
fixes for wavenet and modules

See merge request !47
2020-03-22 11:44:42 +08:00
chenfeiyu 2a1819a19c add warning in Conv1DCell and synthesis.py for wavenet and deepvoice 3 (auto-regressive models) 2020-03-21 15:10:25 +00:00
chenfeiyu 67613951d5 minor fixes for wavenet and modules 2020-03-21 11:52:15 +00:00
lifuchen 75d464221c modified the process of generating masks to speed up batching 2020-03-20 09:37:49 +00:00