Vits paper
README.md
CHANGED
@@ -33,6 +33,8 @@ tags:
 - emotion
 - audio
 - text-to-speech
+- speech-to-speech
+- voice conversion
 - tts
 pipeline_tag: text-to-speech
 ---
@@ -51,6 +53,7 @@ xVAPitch_5820651 model sample: <audio controls>
 </audio>
 
 Papers:
+- VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech - https://arxiv.org/abs/2106.06103
 - YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone - https://arxiv.org/abs/2112.02418
 
 Referenced papers within code: