Aditya Kumar Singh

rodo

http://rodosingh.github.io/

AI & ML interests

Multimodal Learning

upvoted 5 papers 10 months ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published Mar 14, 2025 • 21

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96

VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning

Paper • 2503.13444 • Published Mar 17, 2025 • 17

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Paper • 2503.12937 • Published Mar 17, 2025 • 30

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13, 2025 • 170

upvoted a collection about 1 year ago

Qwen2.5-Coder

Code-specific model series based on Qwen2.5 • 40 items • Updated 3 days ago • 350