voidful/mhubert-unit-tts

text2text-generation


voidful committed on Mar 23, 2023

Commit c92d3b5 · 1 Parent(s): 58a480e

Create README.md

Files changed (1)

README.md +54 -0

README.md ADDED

	@@ -0,0 +1,54 @@

+---
+datasets:
+- librispeech_asr
+language:
+- en
+metrics:
+- wer
+tags:
+- hubert
+- tts
+---
+# voidful/mhubert-unit-tts
+This repository provides a text-to-unit model: it maps text to mHuBERT units and was trained with a BART model.
+The model was trained on the LibriSpeech ASR dataset for the English language.
+Results at training epoch 13: `WER: 30.41`, `CER: 20.22`
+## Hubert Code TTS Example
+```python
+import asrp
+import nlp2
+import IPython.display as ipd
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+# Download the unit-based HiFi-GAN vocoder checkpoint into the current directory.
+nlp2.download_file(
+    'https://dl.fbaipublicfiles.com/fairseq/speech_to_speech/vocoder/code_hifigan/mhubert_vp_en_es_fr_it3_400k_layer11_km1000_lj/g_00500000',
+    './')
+
+# Load the text-to-unit model and its tokenizer.
+tokenizer = AutoTokenizer.from_pretrained("voidful/mhubert-unit-tts")
+model = AutoModelForSeq2SeqLM.from_pretrained("voidful/mhubert-unit-tts")
+model.eval()
+
+# Vocoder wrapper that turns a list of unit ids into a waveform.
+cs = asrp.Code2Speech(tts_checkpoint='./g_00500000', vocoder='hifigan')
+
+# Generate the unit sequence for the input sentence.
+inputs = tokenizer(["The quick brown fox jumps over the lazy dog."], return_tensors="pt")
+code = tokenizer.batch_decode(model.generate(**inputs, max_length=1024))[0]
+
+# Strip the special tokens, then convert each "v_tok_<id>" token to its integer unit id.
+code = [int(i) for i in code.replace("</s>", "").replace("<s>", "").split("v_tok_")[1:]]
+print(code)
+
+# Synthesize the waveform and play it (in a notebook).
+ipd.Audio(data=cs(code), autoplay=False, rate=cs.sample_rate)
+```
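Note on the decoding step in the example above: the split-based parsing works as long as the decoded string contains only `<s>`, `</s>`, and `v_tok_<id>` tokens. A minimal sketch of a more tolerant variant follows, assuming the same `v_tok_<id>` token format. The helper name `parse_unit_string` and the sample string are hypothetical and exist only for illustration; they are not part of the README or of any library.

```python
import re
from typing import List


def parse_unit_string(decoded: str) -> List[int]:
    """Extract integer unit ids from a decoded string of v_tok_<id> tokens.

    Hypothetical helper: anything that is not a v_tok_<id> token
    (special tokens, whitespace) is simply ignored.
    """
    return [int(unit_id) for unit_id in re.findall(r"v_tok_(\d+)", decoded)]


# Made-up decoder output, used only to illustrate the format.
sample = "</s><s>v_tok_12v_tok_7v_tok_993</s>"
units = parse_unit_string(sample)
print(units)
```

Under that assumption, the helper could replace the list comprehension in the example, for instance `code = parse_unit_string(tokenizer.batch_decode(...)[0])`, and the resulting list of integers is what gets passed to the vocoder.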
+## Datasets
+The model was trained on the LibriSpeech ASR dataset for the English language.
+## Language
+The model is trained for the English language.
+## Metrics
+The model's performance is evaluated using Word Error Rate (WER).
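Note on the metrics: the README does not say how the WER and CER figures were obtained (for example, which transcripts were compared). The sketch below only illustrates the standard definitions, edit distance divided by reference length, at word level for WER and character level for CER. The function names are illustrative and are not taken from the author's evaluation code.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate as a fraction of the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a fraction of the number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


print(wer("the quick brown fox", "the quick brown dog"))
print(cer("abcd", "abed"))
```

These functions return fractions. If the figures reported in the README are percentages, a WER of 30.41 would correspond to 0.3041 in this form.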
+## Tags
+The model is tagged with "hubert" and "tts".