Taein-TTS

Description

Taein-TTS is a text-to-speech (TTS) project that reads sentences aloud in my own voice. This repository includes pre-trained models trained on my voice.

Installation

This README explains how to synthesize speech with the pre-trained models. It does not cover model training.

  1. Clone the Hugging Face repository: https://huggingface.co/icecream0910/taein-tts

  2. Modify the run-server.bat batch file in the /server directory to match your actual file paths.

    For example, if your server folder is at C:\myown-tts\server, update the file as follows:

    @echo off
    setlocal
    cd /D "%~dp0"
    set MECAB_KO_DIC_PATH=.\mecab\mecab-ko-dic -r .\mecab\mecabrc
    set TTS_MODEL_FILE=C:\myown-tts\server\models\glowtts-v2\best_model.pth.tar
    set TTS_MODEL_CONFIG=C:\myown-tts\server\models\glowtts-v2\config.json
    set VOCODER_MODEL_FILE=C:\myown-tts\server\models\hifigan-v2\best_model.pth.tar
    set VOCODER_MODEL_CONFIG=C:\myown-tts\server\models\hifigan-v2\config.json
    server.exe
    endlocal
    
  3. Update the glowtts-v2/config.json and hifigan-v2/config.json files in the /server/models/ directory with your actual file paths.

    Write each backslash in the file paths as a double backslash (\\), because JSON treats a single backslash as an escape character:

    • For glowtts-v2/config.json:
    "stats_path": "C:\\mydata\\tts-server\\models\\glowtts-v2\\scale_stats.npy"
    
    • For hifigan-v2/config.json:
    "stats_path": "C:\\mydata\\tts-server\\models\\hifigan-v2\\scale_stats.npy"
    
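Editing the two config files by hand is error-prone because of the backslash escaping. The sketch below is one way to automate step 3 in Python. It rests on several assumptions that the source does not confirm: each config.json is strict JSON without comments, scale_stats.npy sits in the same folder as its config.json (as in the example paths above), and the "stats_path" key may be nested inside another object, so the code searches for it recursively. The function names are hypothetical helpers, not part of this repository.

```python
import json
from pathlib import Path


def replace_stats_path(node, new_path: str) -> int:
    """Recursively set every "stats_path" key under node; return how many were set."""
    count = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "stats_path":
                node[key] = new_path
                count += 1
            else:
                count += replace_stats_path(value, new_path)
    elif isinstance(node, list):
        for item in node:
            count += replace_stats_path(item, new_path)
    return count


def update_config(config_file) -> int:
    """Point "stats_path" at the scale_stats.npy next to config_file (assumed layout)."""
    config_file = Path(config_file).resolve()
    config = json.loads(config_file.read_text(encoding="utf-8"))
    stats_file = config_file.parent / "scale_stats.npy"
    count = replace_stats_path(config, str(stats_file))
    # json.dumps escapes each backslash, so Windows paths are written as C:\\...
    config_file.write_text(
        json.dumps(config, indent=4, ensure_ascii=False), encoding="utf-8"
    )
    return count
```

For example, calling update_config(r"C:\myown-tts\server\models\glowtts-v2\config.json") and then the same for hifigan-v2 would rewrite both files. A return value of 0 means no "stats_path" key was found and the file should be checked by hand. Back up the original files first, since the script rewrites them with new indentation.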

Usage

To start the TTS server, run run-server.bat. Once the command prompt shows the message INFO:werkzeug: * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit), the server is up and speech synthesis is available. To stop the server, press CTRL+C in the command prompt.
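A script that depends on the server can wait for it to come up instead of watching the console. The sketch below is one way to do that in Python, assuming the server listens on localhost port 5000 as the log message indicates. wait_for_server is a hypothetical helper, not part of this repository, and it only checks that the port accepts TCP connections, not that the models have finished loading.

```python
import socket
import time


def wait_for_server(host: str = "localhost", port: int = 5000,
                    timeout: float = 60.0) -> bool:
    """Return True once host:port accepts a TCP connection, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not listening yet (or unreachable); retry shortly.
            time.sleep(0.5)
    return False
```

Calling wait_for_server() after launching run-server.bat returns True as soon as the port is open, or False if nothing is listening within 60 seconds.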

API

  • Text preprocessing: /tts-server/api/process-text

    Splits the input into sentences and removes special characters, so that multi-line text can be stitched together and played back automatically as you type.

  • Text Inference: /tts-server/api/infer-glowtts

    Synthesizes speech from text. Pass the text to synthesize in the text query parameter of the URL.

    Example:

    http://localhost:5000/tts-server/api/infer-glowtts?text=hello
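Text that contains spaces or non-ASCII characters (Korean, for instance) has to be percent-encoded before it goes into the URL. The sketch below shows one way to call the infer-glowtts endpoint from Python using only the standard library. The response format is not documented here, so the code assumes the body is raw audio data and saves it unchanged; build_infer_url, synthesize, and the output file name are hypothetical, not part of this repository.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"


def build_infer_url(text: str, base_url: str = BASE_URL) -> str:
    """Build the infer-glowtts URL with text percent-encoded as the query parameter."""
    return f"{base_url}/tts-server/api/infer-glowtts?{urlencode({'text': text})}"


def synthesize(text: str, out_file: str, base_url: str = BASE_URL) -> int:
    """Request speech for text, save the response body as-is, return bytes written."""
    with urlopen(build_infer_url(text, base_url), timeout=60) as response:
        audio = response.read()  # assumed to be audio data; format not confirmed
    with open(out_file, "wb") as f:
        f.write(audio)
    return len(audio)
```

With the server running, synthesize("hello", "hello.wav") would request the same URL as the example above and store the result in hello.wav. The .wav extension is a guess; check the response's Content-Type header if the file does not play.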
    

Text Inference Demo Page

With the server running, open http://localhost:5000/ in a browser to try the demo page.

Contributing

  1. Fork the repository (https://github.com/icecream0910/myown-tts/fork).
  2. Create a new branch: git checkout -b feature/<featureName>.
  3. Commit your changes: git commit -am 'Add <featureName>'.
  4. Push to the branch: git push origin feature/<featureName>.
  5. Submit a pull request.

License

This project is licensed under the MIT License.

References

This implementation draws inspiration from the following repositories:

The datasets below are distributed under the CC-BY 2.0 license. The original text data comes from AI Hub, operated by the National Information Society Agency (NIA), and includes Korean dialogue text data and a Korean-English translation (parallel) corpus.
