Reward Inside the Model: A Lightweight Hidden-State Reward Model for LLM's Best-of-N sampling Paper • 2505.12225 • Published May 18, 2025 • 8
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization Paper • 2512.24615 • Published 15 days ago • 111
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published 15 days ago • 135
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs Paper • 2507.09477 • Published Jul 13, 2025 • 86
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent Paper • 2411.02937 • Published Nov 5, 2024 • 2