# Tutorial For Nervous Beginners

## Installation

User-friendly installation. Recommended if you only want to synthesize speech.

```bash
$ pip install TTS
```

Developer-friendly installation.

```bash
$ git clone https://github.com/coqui-ai/TTS
$ cd TTS
$ pip install -e .
```

## Training a `tts` Model

This section breaks down a simple script that trains a GlowTTS model on the LJSpeech dataset. See the comments in the script for more details.

### Pure Python Way

0. Download your dataset.

    In this example, we download and use the LJSpeech dataset. Set the download directory based on your preferences.


    ```bash
    $ python -c 'from TTS.utils.downloaders import download_ljspeech; download_ljspeech("../recipes/ljspeech/");'
    ```


1. Define `train.py`.

    ```{literalinclude} ../../recipes/ljspeech/glow_tts/train_glowtts.py
    ```


2. Run the script.

    ```bash
    CUDA_VISIBLE_DEVICES=0 python train.py
    ```


    - Continue a previous run.

        ```bash
        CUDA_VISIBLE_DEVICES=0 python train.py --continue_path path/to/previous/run/folder/
        ```


    - Fine-tune a model.

        ```bash
        CUDA_VISIBLE_DEVICES=0 python train.py --restore_path path/to/model/checkpoint.pth
        ```


    - Run multi-gpu training.

        ```bash
        CUDA_VISIBLE_DEVICES=0,1,2 python -m trainer.distribute --script train.py
        ```
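The four run variants above differ only in the visible GPUs and one optional flag. A minimal sketch of assembling them from Python follows. The helper name `build_train_command` and its arguments are hypothetical and not part of TTS; only the flags and module names shown in the commands above are used, and combinations the tutorial does not show are rejected rather than guessed.

```python
import os
import shlex
import sys


def build_train_command(script="train.py", gpus=(0,), continue_path=None, restore_path=None):
    """Hypothetical helper: return (env, argv) for one of the four commands above."""
    multi_gpu = len(gpus) > 1
    if (continue_path and restore_path) or (multi_gpu and (continue_path or restore_path)):
        # The tutorial does not show these combinations, so this sketch refuses them.
        raise ValueError("this sketch only covers the four commands shown above")

    # Copy the current environment and restrict the GPUs visible to the run.
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(gpu) for gpu in gpus)

    if multi_gpu:
        argv = [sys.executable, "-m", "trainer.distribute", "--script", script]
    else:
        argv = [sys.executable, script]
    if continue_path:
        argv += ["--continue_path", continue_path]
    if restore_path:
        argv += ["--restore_path", restore_path]
    return env, argv


if __name__ == "__main__":
    env, argv = build_train_command(gpus=(0, 1, 2))
    # Only print the command here; pass argv and env to subprocess.run to launch it.
    print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], shlex.join(argv))
```

Returning the environment and argument list separately keeps the sketch free of side effects, so the command can be inspected before anything is launched.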


### CLI Way

We still support training from the CLI, as in earlier versions. The same training run can be started as follows.

1. Define your `config.json`.

    ```json
    {
        "run_name": "my_run",
        "model": "glow_tts",
        "batch_size": 32,
        "eval_batch_size": 16,
        "num_loader_workers": 4,
        "num_eval_loader_workers": 4,
        "run_eval": true,
        "test_delay_epochs": -1,
        "epochs": 1000,
        "text_cleaner": "english_cleaners",
        "use_phonemes": false,
        "phoneme_language": "en-us",
        "phoneme_cache_path": "phoneme_cache",
        "print_step": 25,
        "print_eval": true,
        "mixed_precision": false,
        "output_path": "recipes/ljspeech/glow_tts/",
        "datasets": [{"formatter": "ljspeech", "meta_file_train": "metadata.csv", "path": "recipes/ljspeech/LJSpeech-1.1/"}]
    }
    ```


2. Start training.

    ```bash
    $ CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tts.py --config_path config.json
    ```
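If you prefer to generate the `config.json` from step 1 rather than edit it by hand, one possible approach is to build it as a Python dict and serialize it with the standard `json` module. This is only a sketch: the `write_config` helper and its key checks are hypothetical choices made for illustration, not validation rules enforced by TTS.

```python
import json
from pathlib import Path

# Same values as the config.json in step 1, expressed as a Python dict.
config = {
    "run_name": "my_run",
    "model": "glow_tts",
    "batch_size": 32,
    "eval_batch_size": 16,
    "num_loader_workers": 4,
    "num_eval_loader_workers": 4,
    "run_eval": True,
    "test_delay_epochs": -1,
    "epochs": 1000,
    "text_cleaner": "english_cleaners",
    "use_phonemes": False,
    "phoneme_language": "en-us",
    "phoneme_cache_path": "phoneme_cache",
    "print_step": 25,
    "print_eval": True,
    "mixed_precision": False,
    "output_path": "recipes/ljspeech/glow_tts/",
    "datasets": [
        {
            "formatter": "ljspeech",
            "meta_file_train": "metadata.csv",
            "path": "recipes/ljspeech/LJSpeech-1.1/",
        }
    ],
}


def write_config(cfg, path="config.json"):
    """Hypothetical helper: check a few keys, then write the config as JSON."""
    # These checks are this sketch's own choice, not requirements stated by TTS.
    missing = [key for key in ("model", "output_path", "datasets") if key not in cfg]
    if missing:
        raise KeyError(f"config is missing: {missing}")
    out = Path(path)
    out.write_text(json.dumps(cfg, indent=4) + "\n", encoding="utf-8")
    return out


if __name__ == "__main__":
    print("wrote", write_config(config))
```

The resulting file can then be passed to `--config_path` exactly as in step 2.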


## Training a `vocoder` Model

```{literalinclude} ../../recipes/ljspeech/hifigan/train_hifigan.py
```

❗️ Note that you can also train from the CLI with `train_vocoder.py`, as with the `tts` models above.

## Synthesizing Speech

You can run `tts` and synthesize speech directly on the terminal.

```bash
$ tts -h # see the help
$ tts --list_models  # list the available models.
```

![cli.gif](https://github.com/coqui-ai/TTS/raw/main/images/tts_cli.gif)
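Once you have picked a model from the list, a synthesis call can be scripted roughly as below. Treat this as a sketch under assumptions: the `--text`, `--out_path`, and `--model_name` flags are not shown in this tutorial, so confirm them against `tts -h` for your installed version. The helper name `build_tts_command` is hypothetical.

```python
import shlex
import shutil


def build_tts_command(text, out_path="output.wav", model_name=None):
    """Hypothetical helper: build an argument list for the `tts` command."""
    # Assumed flags (not shown in this tutorial): --text, --out_path, --model_name.
    argv = ["tts", "--text", text, "--out_path", out_path]
    if model_name is not None:
        # Use a name exactly as printed by `tts --list_models`.
        argv += ["--model_name", model_name]
    return argv


if __name__ == "__main__":
    argv = build_tts_command("Hello from the command line.")
    found = shutil.which("tts") is not None
    # Only print the command here; pass argv to subprocess.run to synthesize.
    print("tts on PATH:", found)
    print(shlex.join(argv))
```

Building the arguments as a list avoids shell-quoting problems when the text contains spaces or punctuation.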


You can call `tts-server` to start a local demo server that you can open in
your favorite web browser and 🗣️.

```bash
$ tts-server -h # see the help
$ tts-server --list_models  # list the available models.
```
![server.gif](https://github.com/coqui-ai/TTS/raw/main/images/demo_server.gif)