di-zhang-fdu/Qwen2.5-VL-7B-R1-Distillation


di-zhang-fdu committed 26 days ago

Commit d209e1c (verified) · 1 parent: 6cc3c46

Update README.md

Files changed (1)

README.md +21 -1


@@ -1,5 +1,6 @@
-A Large Multimodal Reasoning Model.
+# A Large Multimodal Reasoning Model.
+## Usage
 ```
 from transformers import Qwen2_5_VLForConditionalGeneration, AutoTokenizer, AutoProcessor
 from qwen_vl_utils import process_vision_info
@@ -87,4 +88,23 @@ output_text = processor.batch_decode(
     generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
 )
 print(output_text)
+```
+## Citations
+```
+@misc {di_zhang_2025,
+	author       = { {Di Zhang} },
+	title        = { Qwen2.5-VL-7B-R1-Distillation (Revision 6cc3c46) },
+	year         = 2025,
+	url          = { https://huggingface.co/di-zhang-fdu/Qwen2.5-VL-7B-R1-Distillation },
+	doi          = { 10.57967/hf/4710 },
+	publisher    = { Hugging Face }
+}
+@article{zhang2024critic,
+  title={Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning},
+  author={Zhang, Di and Lei, Jingdi and Li, Junxian and Wang, Xunzhi and Liu, Yujie and Yang, Zonglin and Li, Jiatong and Wang, Weida and Yang, Suorong and Wu, Jianbo and others},
+  journal={arXiv preprint arXiv:2411.18203},
+  year={2024}
+}
 ```