Stack-and-Delay: a new codebook pattern for music generation • Paper • 2309.08804 • Published Sep 15, 2023 • 4
Enhance audio generation controllability through representation similarity regularization • Paper • 2309.08773 • Published Sep 15, 2023 • 3
PDFTriage: Question Answering over Long, Structured Documents • Paper • 2309.08872 • Published Sep 16, 2023 • 53
End-to-End Speech Recognition Contextualization with Large Language Models • Paper • 2309.10917 • Published Sep 19, 2023 • 9
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model • Paper • 2309.13018 • Published Sep 22, 2023 • 9
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation • Paper • 2309.16429 • Published Sep 28, 2023 • 11
UniAudio: An Audio Foundation Model Toward Universal Audio Generation • Paper • 2310.00704 • Published Oct 1, 2023 • 21