JasperHaozhe committed on
Commit 320acaf · verified · 1 Parent(s): 35da444

Update README.md

  1. README.md +3 -1
README.md CHANGED
@@ -12,6 +12,8 @@ pipeline_tag: visual-question-answering
 
  # VL-Rethinker-72B
 
+ **🚀 News:** <u>We release our meticulously curated collection of RL training queries for multimodal reasoning: [ViRL39K](https://huggingface.co/datasets/TIGER-Lab/ViRL39K).</u>
+
  **VL-Rethinker-72B** achieves SoTA results on various multimodal reasoning benchmarks.
 
  It is trained using the **Forced Rethinking** technique, on top of [**VL-Reasoner**](https://huggingface.co/TIGER-Lab/VL-Reasoner-72B/) with **GRPO-SSR** training.
@@ -27,7 +29,7 @@ Explore further via the following links:
  ## Citation
 
  If you feel this model useful, please give us a free cite:
- ```
+ ```bibtex
  @article{vl-rethinker,
  title={VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning},
  author = {Wang, Haozhe and Qu, Chao and Huang, Zuming and Chu, Wei and Lin, Fangzhen and Chen, Wenhu},