-
VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning
Paper • 2507.22607 • Published • 46 -
Adapting Vision-Language Models Without Labels: A Comprehensive Survey
Paper • 2508.05547 • Published • 11 -
Reinforcement Learning in Vision: A Survey
Paper • 2508.08189 • Published • 27 -
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
Paper • 2508.10751 • Published • 26
Collections
Discover the best community collections!
Collections including paper arxiv:2508.14029
-
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
Paper • 2508.07629 • Published • 39 -
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
Paper • 2508.07101 • Published • 13 -
Compressing Chain-of-Thought in LLMs via Step Entropy
Paper • 2508.03346 • Published • 7 -
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
Paper • 2508.08940 • Published • 23
-
Large Language Models Can Self-Improve in Long-context Reasoning
Paper • 2411.08147 • Published • 67 -
Reverse Thinking Makes LLMs Stronger Reasoners
Paper • 2411.19865 • Published • 23 -
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 90 -
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
Paper • 2412.18925 • Published • 105
-
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 80 -
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Paper • 2508.16153 • Published • 119 -
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
Paper • 2508.14029 • Published • 116 -
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Paper • 2508.18265 • Published • 169
-
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 28 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 31 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 123 -
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Paper • 2412.12098 • Published • 5
-
VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning
Paper • 2507.22607 • Published • 46 -
Adapting Vision-Language Models Without Labels: A Comprehensive Survey
Paper • 2508.05547 • Published • 11 -
Reinforcement Learning in Vision: A Survey
Paper • 2508.08189 • Published • 27 -
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
Paper • 2508.10751 • Published • 26
-
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 80 -
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Paper • 2508.16153 • Published • 119 -
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
Paper • 2508.14029 • Published • 116 -
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Paper • 2508.18265 • Published • 169
-
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
Paper • 2508.07629 • Published • 39 -
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
Paper • 2508.07101 • Published • 13 -
Compressing Chain-of-Thought in LLMs via Step Entropy
Paper • 2508.03346 • Published • 7 -
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
Paper • 2508.08940 • Published • 23
-
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 28 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 31 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 123 -
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Paper • 2412.12098 • Published • 5
-
Large Language Models Can Self-Improve in Long-context Reasoning
Paper • 2411.08147 • Published • 67 -
Reverse Thinking Makes LLMs Stronger Reasoners
Paper • 2411.19865 • Published • 23 -
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 90 -
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
Paper • 2412.18925 • Published • 105