VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation • Paper • 2412.10768 • Published Dec 14, 2024