Update README.md
README.md
CHANGED
@@ -26,7 +26,7 @@ The model can synthesize speech up to **90 minutes** long with up to **4 distinc
 <img src="figures/Fig1.png" alt="VibeVoice Overview" height="250px">
 </p>
 
-## Training
+## Training Details
 Transformer-based Large Language Model (LLM) integrated with specialized acoustic and semantic tokenizers and a diffusion-based decoding head.
 - LLM: [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B) for this release.
 - Tokenizers: