afrideva

AI & ML interests

None yet

Organizations

Keynote Technology · ZeroGPU Explorers · Social Post Explorers · M4-ai · TheSyndicateAI

afrideva's activity

reacted to hexgrad's post with πŸ‘ about 16 hours ago
I wrote an article about G2P: https://hf.co/blog/hexgrad/g2p

G2P is an underrated component of small TTS models, like offensive linemen who do a bunch of work and get no credit.
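For intuition, here is a minimal sketch of what G2P does, using a hypothetical three-word lexicon; real engines such as espeak-ng also handle out-of-vocabulary words, stress marks, and context-dependent pronunciation:

```python
# Minimal grapheme-to-phoneme (G2P) sketch: a toy lexicon lookup.
# The ARPAbet-style entries are illustrative only, not a real lexicon.
TOY_LEXICON = {
    "the": ["DH", "AH"],
    "cat": ["K", "AE", "T"],
    "sat": ["S", "AE", "T"],
}

def g2p(text: str) -> list[str]:
    """Convert text to a flat phoneme sequence via dictionary lookup."""
    phonemes = []
    for word in text.lower().split():
        # Real G2P falls back to a trained model for unknown words;
        # here we just emit a placeholder token.
        phonemes.extend(TOY_LEXICON.get(word, ["<unk>"]))
    return phonemes

print(g2p("The cat sat"))  # ['DH', 'AH', 'K', 'AE', 'T', 'S', 'AE', 'T']
```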

Instead of relying on explicit G2P, larger speech models implicitly learn this task by eating many thousands of hours of audio data. They often use a 500M+ parameter LLM at the front to predict latent audio tokens over a learned codebook, then decode these tokens into audio.
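A rough sketch of that implicit pipeline, with made-up shapes and a toy stand-in for the decoder; nothing here reflects any specific model's architecture:

```python
import torch
import torch.nn as nn

# Illustrative sizes only: codebook entries, latent dim, audio frames.
VOCAB, DIM, FRAMES = 1024, 256, 50

codebook = nn.Embedding(VOCAB, DIM)   # learned codebook of audio latents
decoder = nn.Sequential(              # stand-in for a neural audio decoder
    nn.Linear(DIM, 512), nn.GELU(), nn.Linear(512, 240)
)  # 240 samples per frame ~= 10 ms at 24 kHz

# Pretend the LLM front end predicted these discrete audio tokens from text.
audio_tokens = torch.randint(0, VOCAB, (1, FRAMES))

latents = codebook(audio_tokens)          # (1, FRAMES, DIM)
waveform = decoder(latents).flatten(1)    # (1, FRAMES * 240) raw samples
print(waveform.shape)                     # torch.Size([1, 12000]) -> 0.5 s at 24 kHz
```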

Kokoro instead relies on explicit G2P preprocessing, weighs in at 82M parameters, and thus needs less audio to learn. Because of this, we can cherry-pick high-fidelity audio for training data and deliver solid speech for those voices. In turn, this excellent audio quality and lack of background noise helps explain why Kokoro is very competitive in single-voice TTS Arenas.
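For comparison, a sketch of the explicit-G2P route using the kokoro Python package; the lang_code, voice name, and generator output format follow the project README at the time of writing and may differ in current releases:

```python
# pip install kokoro soundfile
from kokoro import KPipeline
import soundfile as sf

# 'a' selects American English; G2P runs as a preprocessing step
# before the 82M-parameter acoustic model.
pipeline = KPipeline(lang_code='a')

generator = pipeline("G2P does a lot of quiet work.", voice='af_heart')
for i, (graphemes, phonemes, audio) in enumerate(generator):
    print(graphemes, "->", phonemes)      # inspect the G2P output directly
    sf.write(f'segment_{i}.wav', audio, 24000)  # Kokoro outputs 24 kHz audio
```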
reacted to davanstrien's post with ❀️ 7 days ago
upvoted an article 4 months ago

Model2Vec: Distill a Small Fast Model from any Sentence Transformer

By Pringled and 1 other
reacted to Tonic's post with πŸ‘€ 5 months ago
πŸ™‹πŸ»β€β™‚οΈhey there folks ,

βœ’οΈInkubaLM has been trained from scratch using 1.9 billion tokens of data for five African languages, along with English and French data, totaling 2.4 billion tokens of data. It is capable of understanding and generating content in five African languages: Swahili, Yoruba, Hausa, isiZulu, and isiXhosa, as well as English and French.

Model: lelapa/InkubaLM-0.4B
Demo: Tonic/Inkuba-0.4B
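A minimal sketch of trying the model with the standard transformers AutoModel API; the Swahili prompt is an arbitrary example, and trust_remote_code is an assumption in case the repo ships custom modeling code (check the model card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "lelapa/InkubaLM-0.4B"

# trust_remote_code=True is assumed here; drop it if the repo uses a
# stock architecture.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

prompt = "Habari ya leo ni"  # Swahili prompt, chosen only for illustration
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```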