Transcribe audio from microphone, file, or YouTube link
Evaluate open-ended outputs from AI models using MM-Vet