microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • Updated 6 days ago • 1.03M • 1.26k
Apollo: An Exploration of Video Understanding in Large Multimodal Models • Paper • 2412.10360 • Published Dec 13, 2024 • 146