Spaces: parler-tts/parler-tts-expresso

sanchit-gandhi committed on May 10, 2024

Commit 2f42453 (1 parent: 6601d2d)

shorten intro

Files changed (1)

app.py CHANGED

@@ -132,16 +132,14 @@ with gr.Blocks(css=css) as block:
     )
     gr.HTML(
         f"""
-        <p><a href="https://github.com/huggingface/parler-tts"> Parler-TTS</a> is a training and inference library for
-        high-fidelity text-to-speech (TTS) models. The model demonstrated here, <a href="https://huggingface.co/parler-tts/parler_tts_mini_expresso_v0.1"> Parler-TTS Mini: Expresso v0.1</a>,
-        is fine-tuned on the <a href="https://huggingface.co/datasets/ylacombe/expresso"> Expresso dataset</a>.
         It generates high-quality speech in a given <b>emotion</b> and <b>voice</b> that can be controlled through a simple text prompt.</p>
         <p>Tips for ensuring good generation:
         <ul>
             <li>Specify the name of a male speaker (Jerry, Thomas) or female speaker (Talia, Elisabeth) for consistent voices</li>
             <li>The model can generate in a range of emotions, including: "happy", "confused", "default" (meaning no particular emotion conveyed), "laughing", "sad", "whisper", "emphasis"</li>
-            <li>Include the term "high quality audio" to generate the highest quality audio, and "very noisy audio" for high levels of background noise</li>
             <li>Punctuation can be used to control the prosody of the generations, e.g. use commas to add small breaks in speech</li>
             <li>Wrap words in asterisk to emphasise them (e.g. `*Remember*` in the example below)</li>
         </ul>

     )
     gr.HTML(
         f"""
+        <p><a href="https://huggingface.co/parler-tts/parler_tts_mini_expresso_v0.1"> Parler-TTS Mini: Expresso v0.1</a>
+        is a text-to-speech (TTS) model fine-tuned on the <a href="https://huggingface.co/datasets/ylacombe/expresso"> Expresso dataset</a>.
         It generates high-quality speech in a given <b>emotion</b> and <b>voice</b> that can be controlled through a simple text prompt.</p>
         <p>Tips for ensuring good generation:
         <ul>
             <li>Specify the name of a male speaker (Jerry, Thomas) or female speaker (Talia, Elisabeth) for consistent voices</li>
             <li>The model can generate in a range of emotions, including: "happy", "confused", "default" (meaning no particular emotion conveyed), "laughing", "sad", "whisper", "emphasis"</li>
             <li>Punctuation can be used to control the prosody of the generations, e.g. use commas to add small breaks in speech</li>
             <li>Wrap words in asterisk to emphasise them (e.g. `*Remember*` in the example below)</li>
         </ul>
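
Reviewer note: the tips in the intro text amount to a small recipe for the prompt: pick one of the named speakers, pick one of the listed emotions, and mark emphasised words with asterisks. The sketch below shows one way to assemble such inputs. The helper names (`build_description`, `emphasise`) and the sentence template are hypothetical and are not part of `app.py` or the Parler-TTS library; only the speaker and emotion names come from the tips above.

```python
import re

# Speaker and emotion names are taken from the tips in the intro text.
SPEAKERS = {"Jerry", "Thomas", "Talia", "Elisabeth"}
EMOTIONS = {"happy", "confused", "default", "laughing", "sad", "whisper", "emphasis"}


def build_description(speaker: str, emotion: str) -> str:
    """Build a description prompt naming a speaker and an emotion.

    The sentence template is an assumption made for illustration; the model
    may respond better to other phrasings.
    """
    if speaker not in SPEAKERS:
        raise ValueError(f"unknown speaker: {speaker!r}")
    if emotion not in EMOTIONS:
        raise ValueError(f"unknown emotion: {emotion!r}")
    return f"{speaker} speaks in a {emotion} tone."


def emphasise(text: str, words: list[str]) -> str:
    """Wrap each whole-word occurrence of the given words in asterisks.

    Words that are already wrapped are left alone, so the function can be
    applied more than once without doubling the asterisks.
    """
    for word in words:
        pattern = re.compile(rf"(?<![\w*])({re.escape(word)})(?![\w*])")
        text = pattern.sub(r"*\1*", text)
    return text


description = build_description("Talia", "happy")
prompt = emphasise("Remember to call me, okay?", ["Remember"])
```

Validating against the fixed speaker and emotion sets keeps prompts within the names the tips say give consistent voices; whether other names or emotions also work is not stated in the intro text.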