Submitted by akhaliq 36 FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs · 1 author 1
Submitted by petranokhin 28 AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents · 6 authors 2
Submitted by akhaliq 28 Learning to (Learn at Test Time): RNNs with Expressive Hidden States · 12 authors 2
Submitted by richardxp888 24 RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models · 8 authors 3
Submitted by akhaliq 23 ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild · 6 authors 6
Submitted by BK-Lee 19 Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge · 7 authors 1
Submitted by dongguanting 18 DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning · 6 authors 3
Submitted by akhaliq 16 LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs · 81 authors 1
Submitted by akhaliq 13 Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams · 7 authors 1
Submitted by nonstopfor 10 Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks · 7 authors 1
Submitted by akhaliq 7 CRiM-GS: Continuous Rigid Motion-Aware Gaussian Splatting from Motion Blur Images · 5 authors 1
Submitted by emendes3 4 Granular Privacy Control for Geolocation with Vision Language Models · 6 authors 1