Shivam Mehta

shivammehta25

http://www.shivammehta.me

AI & ML interests

Speech, Audio, LLM, Flow Matching, Diffusion, Flows, HMMs

shivammehta25's activity

liked a Space 6 days ago

Hibiki Samples (kyutai/hibiki-samples)

🤗

upvoted a paper 5 months ago

EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer

Paper • 2409.10819 • Published Sep 17, 2024 • 18

upvoted a paper 8 months ago

XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model

Paper • 2406.04904 • Published Jun 7, 2024 • 6

authored a paper 10 months ago

Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis

Paper • 2404.19622 • Published Apr 30, 2024 • 2

upvoted a paper 10 months ago

Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis

Paper • 2404.19622 • Published Apr 30, 2024 • 2

liked a Space 10 months ago

630

TTS Arena

🏆

Vote on the latest TTS models!

liked a Space 11 months ago

EvoVLM JP

🐠

Ask questions about images to get descriptions and answers

upvoted a paper 11 months ago

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

Paper • 2403.11781 • Published Mar 18, 2024 • 18

New activity in TTS-AGI/TTS-Arena 12 months ago

How open source architectures compares

#21 opened 12 months ago by

shivammehta25

upvoted a paper about 1 year ago

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 83

liked 2 Spaces about 1 year ago

3.19k

InstantID

😻

Generate personalized images with a face preservation

Matcha TTS

🍵

Generate speech from text input

upvoted a paper about 1 year ago

Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis

Paper • 2312.03491 • Published Dec 6, 2023 • 34

New activity in shivammehta25/Matcha-TTS about 1 year ago

Apply for community grant: Academic project (gpu)

#1 opened about 1 year ago by

shivammehta25

liked a Space over 1 year ago

2.35k

XTTS

🐸

authored a paper over 1 year ago

Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation

Paper • 2309.05455 • Published Sep 11, 2023

liked a model over 1 year ago

coqui/XTTS-v1

Text-to-Speech • Updated Nov 10, 2023 • 20.7k • 369

liked a Space over 1 year ago

Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis

🤖

upvoted a paper over 1 year ago

Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis

Paper • 2306.09417 • Published Jun 15, 2023 • 3

authored a paper over 1 year ago

Prosody-controllable spontaneous TTS with neural HMMs

Paper • 2211.13533 • Published Nov 24, 2022