tianchez
posted an update 9 days ago
Introducing VLM-R1!

GRPO helped DeepSeek R1 learn to reason. Can it also help VLMs perform better on general computer vision tasks?

The answer is YES, and it generalizes better than SFT. We trained Qwen 2.5 VL 3B on RefCOCO (a visual grounding task) and evaluated it on RefCOCO Val and RefGTA (an OOD task).
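
For readers new to GRPO, here is a rough sketch of its core idea applied to grounding: sample a group of answers for one prompt, score each one, and normalize the scores within the group. The IoU-based reward is an assumption made for illustration, since the post does not say which reward VLM-R1 uses, and the names `box_iou` and `group_relative_advantages` are hypothetical, not taken from the repo.

```python
from statistics import mean, pstdev


def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Assumed setup: one ground-truth box and four sampled predictions for the same prompt.
target = (10, 10, 50, 50)
samples = [(10, 10, 50, 50), (20, 20, 60, 60), (0, 0, 30, 30), (60, 60, 90, 90)]

rewards = [box_iou(pred, target) for pred in samples]  # assumed IoU reward
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Samples that beat the group average get a positive advantage and the rest get a negative one, so no separate value model is needed to provide a baseline.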

https://github.com/om-ai-lab/VLM-R1

Great job, guys. Reasoning brings so much potential!

We also had a similar idea, but only applied it to maze solving:

https://huggingface.co/homebrewltd/AlphaMaze-v0.2-1.5B
