add tutorials into advanced
commit 73374528d0 (parent b017c73100)
Model
-------------

As a common practice with paddlepaddle, models are implemented as subclasses
of ``paddle.nn.Layer``. Models could be simple, like a single-layer RNN. For
complicated models, it is recommended to split the model into different
components.

For an encoder-decoder model, it is natural to split it into the encoder and
the decoder. For a model composed of several similar layers, it is natural to
extract the sublayer as a separate layer.

There are two common ways to define a model which consists of several modules.

#. Define a module given the specifications. Here is an example with a
   multilayer perceptron.

.. code-block:: python
    module = MLP(16, 32, 4)  # initialize a module

When the module is intended to be a generic and reusable layer that can be
integrated into a larger model, we prefer to define it in this way.
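The body of the MLP class is elided in this diff. As a rough, framework-free sketch of what such a spec-style module might look like (a numpy stand-in for a ``paddle.nn.Layer`` subclass; the names and layer sizes are illustrative, not the project's actual code):

```python
import numpy as np

class MLP:
    """Spec-style module: every hyperparameter is an explicit argument."""
    def __init__(self, input_size, hidden_size, output_size):
        rng = np.random.default_rng(0)
        # parameters are created from the given specifications
        self.w1 = rng.standard_normal((input_size, hidden_size)) * 0.1
        self.b1 = np.zeros(hidden_size)
        self.w2 = rng.standard_normal((hidden_size, output_size)) * 0.1
        self.b2 = np.zeros(output_size)

    def forward(self, x):
        # two linear layers with a tanh nonlinearity in between
        h = np.tanh(x @ self.w1 + self.b1)
        return h @ self.w2 + self.b2

module = MLP(16, 32, 4)                  # initialize a module
out = module.forward(np.zeros((8, 16)))
print(out.shape)                         # (8, 4)
```

Because the constructor takes plain sizes, the module can be instantiated anywhere a layer with this interface is needed.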
For considerations of readability and usability, we strongly recommend
**NOT** to pack specifications into a single object. Here's an example below.

.. code-block:: python
       def forward(self, x):
           return self.linear2(paddle.tanh(self.linear1(x)))

For a module defined in this way, it's harder for the user to initialize an
instance. Users have to read the code to check what attributes are used.

Also, code in this style tends to be abused by passing a huge config object
to initialize every module used in an experiment, though each module may
not need the whole configuration.

We prefer to be explicit.
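To make the contrast concrete, here is a small hypothetical illustration (plain Python, invented names) of the two constructor styles discussed above:

```python
# Discouraged: specifications packed into one object. The signature says
# nothing about which fields are required; the reader must inspect the body.
class MLPFromConfig:
    def __init__(self, config):
        self.input_size = config["input_size"]
        self.hidden_size = config["hidden_size"]
        self.output_size = config["output_size"]

# Preferred: explicit specifications, a self-documenting signature.
class ExplicitMLP:
    def __init__(self, input_size, hidden_size, output_size):
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size

# The config style drags along keys the module never uses (e.g. "lr").
cfg = {"input_size": 16, "hidden_size": 32, "output_size": 4, "lr": 1e-3}
a = MLPFromConfig(cfg)       # needs the whole dict, uses only three keys
b = ExplicitMLP(16, 32, 4)   # needs exactly what it uses
print(a.hidden_size == b.hidden_size)  # True
```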
#. Define a module as a combination given its components. Here is an example
   for a sequence-to-sequence model.

.. code-block:: python
    decoder = Decoder(...)
    model = Seq2Seq(encoder, decoder)  # compose two components

When a model is complicated and made up of several components, each of which
has a separate functionality and can be replaced by other components with the
same functionality, we prefer to define it in this way.
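The composition pattern above can be sketched with toy stand-ins (the real encoder and decoder are paddle layers; these hypothetical classes only show the wiring):

```python
class Encoder:
    def forward(self, x):
        return [t * 2 for t in x]          # toy "encoding"

class Decoder:
    def forward(self, h):
        return sum(h)                      # toy "decoding"

class Seq2Seq:
    def __init__(self, encoder, decoder):
        # components are injected, so each can be swapped for any other
        # object exposing the same interface
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, x):
        return self.decoder.forward(self.encoder.forward(x))

model = Seq2Seq(Encoder(), Decoder())      # compose two components
print(model.forward([1, 2, 3]))            # 12
```

Replacing ``Decoder`` with any object that has a compatible ``forward`` leaves ``Seq2Seq`` untouched, which is exactly why this style suits models built from replaceable parts.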
Data
-------------

Another critical component of a deep learning project is data. As a common
practice, we use the dataset and dataloader abstractions.

Dataset
^^^^^^^^^^

A dataset is the representation of a set of examples used for a project. In
most cases, a dataset is a collection of examples. A dataset is an object
which has the methods below.

#. ``__len__``, to get the size of the dataset.
#. ``__getitem__``, to get an example by key or index.

An example is a record consisting of several fields. In practice, we usually
represent it as a namedtuple for convenience, yet dicts and user-defined
objects are also supported.

We define our own dataset by subclassing ``paddle.io.Dataset``.
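A minimal stand-in following this protocol (plain Python so it stays self-contained; a real project would subclass ``paddle.io.Dataset``, and the field names here are invented):

```python
from collections import namedtuple

# an example is a record with several fields; a namedtuple is convenient
Example = namedtuple("Example", ["text", "label"])

class ToyDataset:
    """Implements the two methods a dataset needs."""
    def __init__(self, records):
        self._examples = [Example(*r) for r in records]

    def __len__(self):
        return len(self._examples)           # size of the dataset

    def __getitem__(self, idx):
        return self._examples[idx]           # get an example by index

ds = ToyDataset([("hello", 0), ("world", 1)])
print(len(ds), ds[1].text)  # 2 world
```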
DataLoader
^^^^^^^^^^^

In deep learning practice, models are trained with minibatches. DataLoader
meets the need for iterating over the dataset in batches. This is done by
providing a sampler and a batch function in addition to a dataset.

#. sampler, samples indices or keys used to get examples from the dataset.
#. batch function, transforms a list of examples into a batch.

A commonly used sampler is ``RandomSampler``; it shuffles all the valid
indices and then iterates over them sequentially. ``DistributedBatchSampler``
is a sampler used for distributed data-parallel training, where the sampler
handles data sharding in a dynamic way.

The batch function is used to transform the selected examples into a batch.
For a simple case where an example is composed of several fields, each of
which is represented by a fixed-size array, the batch function can simply
stack each field. For cases where variable-size arrays are included in the
example, batching may involve padding and stacking. In theory, the batch
function can do more, like random slicing.

For a custom dataset used with a custom model, it is required to define a
batch function for it.
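Both pieces can be sketched in a few lines of plain Python (hypothetical helper names; the real ``RandomSampler`` and batch functions live in paddle and parakeet): a sampler that shuffles all valid indices, and a batch function that pads variable-length examples before stacking.

```python
import random

def random_sample_indices(n, seed=0):
    # shuffle all the valid indices; they are then iterated sequentially
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return indices

def batch_examples(examples, pad_value=0):
    # pad every sequence to the longest one in the batch, then stack
    max_len = max(len(e) for e in examples)
    return [list(e) + [pad_value] * (max_len - len(e)) for e in examples]

indices = random_sample_indices(4)
batch = batch_examples([[1, 2, 3], [4], [5, 6]])
print(sorted(indices))  # [0, 1, 2, 3]
print(batch)            # [[1, 2, 3], [4, 0, 0], [5, 6, 0]]
```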
Config
-------------

It's common to change the running configuration to compare results. To keep
track of the running configuration, we use ``yaml`` configuration files.

Also, we want to interact with command line options. Some options that usually
change according to running environments are provided by command line
arguments. In addition, we want to override an option in the config file
without editing it.

Taking these requirements into consideration, we use `yacs
<https://github.com/rbgirshick/yacs>`_ as a config management tool. Other
tools like `omegaconf <https://github.com/omry/omegaconf>`_ are also powerful
and have similar functions.

In each example provided, there is a ``config.py``, where the default config
is defined. If you want to get the default config, import ``config.py`` and
call ``get_cfg_defaults()`` to get the default config. Then it can be updated
with a yaml config file or command line arguments if needed.

For details about how to use yacs in experiments, see `yacs
<https://github.com/rbgirshick/yacs>`_.
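The workflow can be sketched with a plain dict standing in for a yacs ``CfgNode``. The config keys and the ``merge_from_list`` helper below are hypothetical; yacs itself provides the real ``merge_from_file`` and ``merge_from_list``.

```python
def get_cfg_defaults():
    # the default config, normally defined in config.py
    return {"data": {"batch_size": 32}, "training": {"lr": 1e-3}}

def merge_from_list(cfg, opts):
    # override nested options from KEY VALUE pairs, e.g. parsed from the
    # command line without editing the yaml file
    for key, value in zip(opts[::2], opts[1::2]):
        *parents, leaf = key.split(".")
        node = cfg
        for p in parents:
            node = node[p]
        node[leaf] = type(node[leaf])(value)  # keep the default's type
    return cfg

cfg = get_cfg_defaults()
merge_from_list(cfg, ["training.lr", "0.01", "data.batch_size", "16"])
print(cfg)  # {'data': {'batch_size': 16}, 'training': {'lr': 0.01}}
```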

Experiment
--------------
===========
Tutorials
===========

Basic Usage
-------------------

This section shows how to use pretrained models provided by parakeet and make
inference with them.

Pretrained models are provided in an archive. Extract it to get a folder like
this::

    checkpoint_name/
    ├──config.yaml
    └──step-310000.pdparams

The ``config.yaml`` stores the config used to train the model, and
``step-N.pdparams`` is the parameter file, where N is the number of steps the
model has been trained for.

The example code below shows how to use the models for prediction.

text to spectrogram
^^^^^^^^^^^^^^^^^^^^^^

The code below shows how to use a transformer_tts model. After loading the
pretrained model, use ``model.predict(sentence)`` to generate spectrograms
(in numpy.ndarray format), which can be further used to synthesize raw audio
with a vocoder.

>>> import parakeet
>>> from parakeet.frontend import English
vocoder
^^^^^^^^^^

Like the example above, after loading the pretrained ``ConditionalWaveFlow``
model, call ``model.predict(mel)`` to synthesize raw audio (in wav format).

>>> import soundfile as sf
>>> from parakeet.models import ConditionalWaveFlow
>>> sf.write(audio_path, audio, config.data.sample_rate)

For more details on how to use the model, please refer to the documentation.
   :maxdepth: 1

   install
   tutorials
   basic
   advanced

.. toctree::
Install PaddlePaddle
------------------------
Parakeet requires PaddlePaddle as its backend. Note that 2.0.0rc1 or newer
versions of paddle are required.

Since paddlepaddle has multiple packages depending on the device (cpu or gpu)
    # ubuntu, debian
    sudo apt-get install libsndfile1

    # centos, fedora
    sudo yum install libsndfile

    # openSUSE
There are two ways to install parakeet according to the purpose of using it.

#. If you want to run experiments provided by parakeet or add new models and
   experiments, it is recommended to clone the project from github
   (`Parakeet <https://github.com/PaddlePaddle/Parakeet>`_), and install it in
   editable mode.

   .. code-block:: bash
      cd Parakeet
      pip install -e .

#. If you only need to use the models for inference with parakeet, installing
   from pypi is recommended.

   .. code-block:: bash

      pip install paddle-parakeet