Audio-Conditioned LipSync with Latent Diffusion Models
Transcribe or translate audio and YouTube videos