Compare commits

94 Commits

Author SHA1 Message Date
Feiyu Chan d726863138 fix a config key error (#110) 2021-05-18 18:13:36 +08:00
chenfeiyu fa6ddf5b0c bump version string to 0.3.0 2021-05-17 11:33:39 +08:00
chenfeiyu c02adfdad8 Merge branch 'develop' of https://github.com/PaddlePaddle/Parakeet into develop 2021-05-17 11:29:31 +08:00
chenfeiyu e1a7c296fe simplify text processing code and update notebook 2021-05-13 17:06:34 +08:00
chenfeiyu 6a1fb158d9 format code with pre-commit 2021-05-13 16:22:56 +08:00
chenfeiyu 73ca693395 add praatio into requirements for running the experiments 2021-05-11 22:46:09 +08:00
chenfeiyu 2f644e1b8b refine READMEs and clean code 2021-05-11 22:44:02 +08:00
chenfeiyu 8bcbcf8a86 add links to download pretrained models 2021-05-07 16:49:11 +08:00
chenfeiyu 71a87559da update README 2021-05-07 16:28:23 +08:00
chenfeiyu 664fc20c0a update doc 2021-05-07 16:16:58 +08:00
chenfeiyu b9aa61b5eb update docstrings for tacotron 2021-05-07 16:08:31 +08:00
chenfeiyu f197e4d04f update README and doc 2021-05-07 15:35:47 +08:00
chenfeiyu ef1ea56ed6 fix typos and docs 2021-05-07 15:03:54 +08:00
chenfeiyu 38831bf8b6 add extra_config keys into the default config of tacotron 2021-04-30 14:27:08 +08:00
chenfeiyu b88a0f90aa add STFT back 2021-04-29 17:54:07 +08:00
iclementine 42092f1f5b update README for examples/ge2e 2021-04-29 17:15:18 +08:00
iclementine b1304cb449 add images for examples/tacotron2_aishell3's README 2021-04-29 17:09:40 +08:00
iclementine cab12c2dfd update tacotron_aishell3's README 2021-04-29 17:00:26 +08:00
iclementine ba7639b994 update tacotron2 2021-04-29 16:43:03 +08:00
iclementine 123bbe994f update tacotron2 from_pretrained, update setup.py 2021-04-29 16:04:32 +08:00
iclementine 701376f401 remove tacotron2_msp 2021-04-28 20:05:12 +08:00
iclementine 77eb13d95d format code 2021-04-28 20:02:29 +08:00
chenfeiyu cbe531158e add plot_multiple_attentions and update visualization code in transformer_tts 2021-04-27 17:40:50 +08:00
chenfeiyu 263d3eb88b add an option to alter the loss and model structure of tacotron2, add an alternative config 2021-04-26 21:18:29 +08:00
chenfeiyu 4fc86abf5a Merge branch 'baker' of https://github.com/iclementine/Parakeet into baker 2021-04-25 11:11:36 +08:00
chenfeiyu 85649725fb add voice cloning notebook 2021-04-25 11:11:24 +08:00
iclementine cf01a0da22 add more details to the README, fix some preprocess scripts 2021-04-25 11:00:42 +08:00
iclementine 4426417da1 WIP: add README 2021-04-22 17:40:36 +08:00
iclementine e8a9a118bb clean code for data processing 2021-04-22 17:20:34 +08:00
iclementine 56f2552201 fix argument name 2021-04-22 14:50:52 +08:00
chenfeiyu c2560e8aa2 fix argument order 2021-04-22 13:46:51 +08:00
iclementine 3a744dbf30 clean code 2021-04-22 13:25:25 +08:00
iclementine 764c35e99e move tacotron2_msp 2021-04-22 11:00:33 +08:00
chenfeiyu c8627fdd75 remove imports to deleted modules 2021-04-20 20:12:57 +08:00
chenfeiyu 16b4d4eecb remove files not included in this release 2021-04-20 17:12:22 +08:00
chenfeiyu 6b3999217b remove imports that are removed 2021-04-20 15:54:55 +08:00
iclementine e992e17456 resolve conflict 2021-04-19 20:17:21 +08:00
iclementine 0eea7cc373 fix typos 2021-04-19 20:15:46 +08:00
iclementine f8f3ec4709 Merge branch 'baker' of github.com:iclementine/Parakeet into baker 2021-04-19 20:12:07 +08:00
chenfeiyu 9da118e53b merge wavenet 2021-04-19 20:09:01 +08:00
chenfeiyu 3741cc49ca change wavenet to use on-the-fly preprocessing 2021-04-19 19:58:36 +08:00
iclementine b53b274585 change batch_text_id, batch_spec, batch_wav to include valid lengths in the returned value 2021-04-19 17:06:52 +08:00
iclementine 6749ce40ea add audio datasets 2021-04-19 16:17:30 +08:00
iclementine 49f2c4b3fb change stft to use conv1d 2021-04-16 15:01:10 +08:00
iclementine e06c6cdfe1 WIP:update hifigan 2021-04-15 17:23:42 +08:00
iclementine 68497f89a4 WIP: add hifigan 2021-04-14 20:59:26 +08:00
chenfeiyu e54f23befd update collate function, data loader now does not convert nested list into numpy array. 2021-04-14 20:51:13 +08:00
chenfeiyu c6965e2c5a fix fmax for example/waveflow 2021-04-14 20:50:38 +08:00
iclementine b674f63d74 add 2 frontend 2021-04-08 04:59:29 +08:00
iclementine 184745f42b add gst layer 2021-04-08 04:59:03 +08:00
iclementine dc3b798f82 add global condition support for tacotron2 2021-04-08 04:58:44 +08:00
chenfeiyu 5011f16c10 minor fix 2021-04-07 10:55:05 +08:00
iclementine 4d3014f4d5 add new trainer 2021-04-03 16:19:46 +08:00
iclementine 27e0201d0d format code for tacotron_vctk, add plot_waveform to display 2021-04-02 15:46:28 +08:00
iclementine a3fae49022 merge refactor_tacotron 2021-04-02 11:48:16 +08:00
iclementine 274d8dce07 update experiment and display 2021-04-02 11:37:48 +08:00
iclementine 15b205d6e0 Merge branch 'develop' into baker 2021-04-02 11:23:21 +08:00
chenfeiyu 8d67066765 add example for baker and aishell3 2021-04-02 11:06:34 +08:00
chenfeiyu 9babec0f98 fix text log extension name 2021-04-01 13:49:52 +08:00
chenfeiyu 752272de98 fix bugs 2021-04-01 13:15:06 +08:00
iclementine e0052ccedf fix typos 2021-03-31 19:38:12 +08:00
iclementine a834e132b9 fix root path 2021-03-31 19:36:48 +08:00
iclementine dd73ee6611 fix root path 2021-03-31 19:35:59 +08:00
iclementine 883bc16d24 fix root path 2021-03-31 19:33:33 +08:00
iclementine 9798d07337 fix visualizer 2021-03-31 19:32:23 +08:00
iclementine f84d460613 fix class name 2021-03-31 19:31:16 +08:00
iclementine 327c7a5ce4 fix indentation 2021-03-31 19:29:09 +08:00
iclementine 4a039b6407 add vctk example for refactored tacotron 2021-03-31 17:34:19 +08:00
iclementine 7cc3e8c340 add a simple strategy to support multispeaker for tacotron. 2021-03-31 15:23:41 +08:00
iclementine 2dd393349f Merge branch 'develop' into refactor_tacotron 2021-03-30 16:01:22 +08:00
iclementine e3f7bb5a51 simplify visualization code 2021-03-30 15:56:14 +08:00
chenfeiyu 0fdb86834b Merge branch 'develop' into baker 2021-03-30 14:39:11 +08:00
chenfeiyu b5dd0cc197 fix speaker encoder and add support for 2 more datasets 2021-03-30 14:38:44 +08:00
iclementine 4757f08550 Merge branch 'develop' into baker 2021-03-29 11:17:51 +08:00
iclementine 59ed247840 fix lstm speaker encoder 2021-03-29 11:17:23 +08:00
iclementine ab85d5ca13 Merge branch 'develop' into baker 2021-03-29 11:13:57 +08:00
iclementine 5443e23fb7 fix lstm speaker encoder 2021-03-29 11:12:02 +08:00
iclementine 6defef4944 Merge branch 'baker' of github.com:iclementine/Parakeet into baker 2021-03-29 10:49:24 +08:00
chenfeiyu 489fb69f55 Merge branch 'develop' into baker 2021-03-29 10:49:34 +08:00
iclementine a9a78742fa Merge branch 'develop' into baker 2021-03-29 10:42:17 +08:00
iclementine 2475da3322 add ge2e 2021-03-27 17:39:37 +08:00
chenfeiyu a005cc88a3 WIP: baker 2021-03-27 12:43:03 +08:00
iclementine 2b62fbb614 1. change the default min value of LogMagnitude to 1e-5; 2. remove stop logit prediction from tacotron2 model. 2021-03-23 10:44:22 +08:00
iclementine da63cfa42e add an embedding layer. 2021-03-22 21:39:22 +08:00
iclementine f9d6160916 add an option to normalize volume when loading audio. 2021-03-22 21:38:28 +08:00
iclementine 086fbf8e35 refactoring code 2021-03-22 21:23:46 +08:00
chenfeiyu 3c60fec900 remove bn in postnet 2021-02-27 03:26:41 +08:00
chenfeiyu 929165b64a 1. remove space from numericalized representation; 2. fix decoder padding mask's unsqueeze dim. 2021-02-27 02:59:38 +08:00
chenfeiyu ae9e218073 use emb add in tacotron2 2021-02-26 18:08:26 +08:00
chenfeiyu 40237c40b0 Merge branch 'develop' of https://github.com/PaddlePaddle/Parakeet into baker 2021-02-26 11:07:03 +08:00
chenfeiyu 9e4d5a3d8a fix experiments for waveflow and wavenet, only write visual log in rank-0 2021-02-21 17:30:13 +08:00
chenfeiyu 6a92fde9b2 Merge branch 'develop' of https://github.com/PaddlePaddle/Parakeet into baker 2021-02-18 19:58:27 +08:00
chenfeiyu 25bd8987a6 Merge branch 'develop' of https://github.com/PaddlePaddle/Parakeet into baker 2021-02-18 19:51:56 +08:00
chenfeiyu 239703be8b hacky thing, add tone support for acoustic model 2021-02-10 22:58:08 +08:00
4 changed files with 4 additions and 4 deletions

@@ -39,7 +39,7 @@ _C.model = CN(
 d_ffn=1024, # encoder_d_ffn & decoder_d_ffn
 encoder_layers=4, # number of transformer encoder layer
 decoder_layers=4, # number of transformer decoder layer
-d_prenet=256, # decprenet's hidden size (d_mel=>d_prenet=>d_decoder)
+d_prenet=256, # decoder prenet's hidden size (n_mels=>d_prenet=>d_decoder)
 d_postnet=256, # decoder postnet(cnn)'s internal channel
 postnet_layers=5, # decoder postnet(cnn)'s layer
 postnet_kernel_size=5, # decoder postnet(cnn)'s kernel size
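
The updated comment describes the decoder prenet as mapping n_mels => d_prenet => d_decoder. A minimal shape-only sketch of that flow, assuming two linear layers; `N_MELS = 80` and `D_DECODER = 512` are illustrative values not given in this diff, and `linear_shape` and `prenet_output_shape` are hypothetical helpers, not Parakeet functions:

```python
# Shape-only sketch of the decoder prenet (n_mels => d_prenet => d_decoder).
# Only D_PRENET comes from the config above; the other sizes are assumptions.
N_MELS = 80       # assumed number of mel bands
D_PRENET = 256    # d_prenet from the config in this diff
D_DECODER = 512   # assumed decoder width


def linear_shape(in_shape, d_in, d_out):
    """Return the output shape of a linear layer applied on the last axis."""
    if in_shape[-1] != d_in:
        raise ValueError(f"expected last dim {d_in}, got {in_shape[-1]}")
    return in_shape[:-1] + (d_out,)


def prenet_output_shape(mel_shape):
    """Track a (batch, time, n_mels) shape through the two prenet layers."""
    hidden = linear_shape(mel_shape, N_MELS, D_PRENET)
    return linear_shape(hidden, D_PRENET, D_DECODER)


print(prenet_output_shape((2, 100, N_MELS)))
```

The check in `linear_shape` is the point: a feature size that disagrees with the configured `n_mels` fails at the first layer.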

@@ -38,7 +38,7 @@ def create_dataset(config, source_path, target_path, verbose=False):
 processor = AudioProcessor(
 sample_rate=config.data.sample_rate,
 n_fft=config.data.n_fft,
-n_mels=config.data.d_mel,
+n_mels=config.data.n_mels,
 win_length=config.data.win_length,
 hop_length=config.data.hop_length,
 fmax=config.data.fmax,

@@ -12,6 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-__version__ = "0.2.0-beta.0"
+__version__ = "0.3.0"
 from parakeet import audio, data, datasets, frontend, models, modules, training, utils

@@ -571,7 +571,7 @@ class TransformerTTS(nn.Layer):
 frontend,
 d_encoder=config.model.d_encoder,
 d_decoder=config.model.d_decoder,
-d_mel=config.data.d_mel,
+d_mel=config.data.n_mels,
 n_heads=config.model.n_heads,
 d_ffn=config.model.d_ffn,
 encoder_layers=config.model.encoder_layers,
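
The hunks above replace reads of `config.data.d_mel` with `config.data.n_mels`, which suggests the data section defines only `n_mels`. A minimal sketch of why the stale key fails, using `types.SimpleNamespace` as a stand-in for the project's config object (an assumption; the real code builds its config with `CN`), with `80` as an illustrative value and `read_n_mels` / `read_d_mel` as hypothetical helpers:

```python
from types import SimpleNamespace

# Stand-in config: only attribute access is modeled here.
# The value 80 is illustrative and does not come from the diff.
config = SimpleNamespace(data=SimpleNamespace(n_mels=80))


def read_n_mels(cfg):
    """Read the key that the data section defines."""
    return cfg.data.n_mels


def read_d_mel(cfg):
    """Read the stale key name; fails because the key is not defined."""
    return cfg.data.d_mel


print(read_n_mels(config))

try:
    read_d_mel(config)
except AttributeError as exc:
    error = exc
    print("stale key rejected:", exc)
```

Note that in the `TransformerTTS` hunk only the value side changes: the constructor keyword stays `d_mel`, while the config key it reads becomes `n_mels`.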