Text-to-Speech
Moshi
English
French
tts
audio
adefossez committed on
Commit 627975d · verified · 1 Parent(s): 172acde

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -60,7 +60,7 @@ See the [GitHub repository](https://github.com/kyutai-labs/delayed-streams-model
 
 ## Training Details
 
-The model was trained for 750k steps, with a batch size of 64, and a segment duration of 120 seconds.
+The model was trained for 750k steps, with a batch size of 64, and a segment duration of 120 seconds. Then, CFG distillation was performed for 24k updates.
 
 ### Training Data
 
@@ -71,7 +71,7 @@ with `whisper-medium`.
 
 ### Compute Infrastructure
 
-Pretraining and finetuning was done with 32 H100 Nvidia GPUs.
+Pretraining was done with 32 H100 Nvidia GPUs. CFG distillation was done on 8 such GPUs.
 
 ## Model Card Authors
 
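
For a sense of scale, the pretraining figures stated in the README (750k steps, batch size 64, 120-second segments) can be turned into a rough estimate of the audio processed. This is a back-of-the-envelope sketch, not something the README states: it assumes the batch size counts segments per step and that every segment is a full 120 seconds, so the result is an upper bound that also counts any repeated passes over the same data. The function name is hypothetical.

```python
# Figures copied from the README text in this commit.
PRETRAIN_STEPS = 750_000
BATCH_SIZE = 64          # assumed to mean segments per training step
SEGMENT_SECONDS = 120    # assumed to be the full length of every segment


def pretraining_audio_hours(steps: int, batch_size: int, segment_seconds: int) -> float:
    """Upper bound on hours of audio processed during pretraining.

    Hypothetical helper: counts every segment at full length and does not
    distinguish unique audio from repeated passes over the same data.
    """
    total_seconds = steps * batch_size * segment_seconds
    return total_seconds / 3600


if __name__ == "__main__":
    hours = pretraining_audio_hours(PRETRAIN_STEPS, BATCH_SIZE, SEGMENT_SECONDS)
    print(f"{hours:,.0f} hours")  # prints "1,600,000 hours"
```

Under those assumptions, pretraining processed at most about 1.6 million hours of audio. The CFG distillation stage is left out because the README gives only its update count (24k), not its batch size or segment length.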