YuE: Scaling Open Foundation Models for Long-Form Music Generation • Paper • 2503.08638 • Published 3 days ago • 54
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens • Paper • 2503.01710 • Published 11 days ago • 3
ChatMusician: Understanding and Generating Music Intrinsically with LLM • Paper • 2402.16153 • Published Feb 25, 2024 • 60
MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training • Paper • 2306.00107 • Published May 31, 2023 • 4
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model • Paper • 2305.06908 • Published May 11, 2023 • 6