Nishith Jain

KingNish

AI & ML interests

AI is fun actually. Busy till June 2025.

Recent Activity

updated a Space 23 minutes ago
innova-ai/YuE-music-generator-demo
liked a model about 3 hours ago
black-forest-labs/FLUX.1-schnell-onnx
liked a model about 3 hours ago
retronic/colox-v1

Organizations

Wikimedia
OpenGVLab
Blog-explorers
Multi🤖Transformers
The Collectionists
HelpingAI
ZeroGPU Explorers
Project Fluently
Poscye
INNOVA AI
Narra
Social Post Explorers
Cognitive Computations
Dev Mode Explorers
Stable Diffusion Community (Unofficial, Non-profit)
ONNX Community
Hugging Face Discord Community
grafite
None yet
Project R

KingNish's activity

reacted to retronic's post with 🔥 about 3 hours ago
Post
865
Colox, a reasoning AI model. I am currently working on a model smarter than GPT o1 that thinks before it speaks. It is coming tomorrow in the afternoon.
  • 1 reply
updated a Space about 3 hours ago
reacted to hexgrad's post with 👍 about 4 hours ago
Post
3279
I wrote an article about G2P: https://hf.co/blog/hexgrad/g2p

G2P (grapheme-to-phoneme conversion) is an underrated piece of small TTS models, like offensive linemen who do a bunch of work and get no credit.

Instead of relying on explicit G2P, larger speech models implicitly learn this task by eating many thousands of hours of audio data. They often use a 500M+ parameter LLM at the front to predict latent audio tokens over a learned codebook, then decode these tokens into audio.

Kokoro instead relies on G2P preprocessing, is 82M parameters, and thus needs less audio to learn. Because of this, we can cherry-pick high-fidelity audio for training data and deliver solid speech for those voices. In turn, this excellent audio quality and lack of background noise help explain why Kokoro is very competitive in single-voice TTS Arenas.
  • 1 reply
New activity in fffiloni/YuE about 5 hours ago

Optimized for speed

1
#7 opened about 5 hours ago by KingNish
upvoted an article 3 days ago
Article

Open-source DeepResearch – Freeing our search agents

701