YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published 23 days ago • 60
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference Paper • 2502.18137 • Published Feb 25 • 54
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published Dec 19, 2024 • 18
BhasaAnuvaad Collection A Speech Translation Dataset for 13 Indian Languages • 11 items • Updated Jan 16 • 16
SongCreator: Lyrics-based Universal Song Generation Paper • 2409.06029 • Published Sep 9, 2024 • 22