UnifiedReward Models - a CodeGoat24 Collection

updated 1 day ago

Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7 • 124

Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning

Paper • 2505.03318 • Published May 6 • 94

CodeGoat24/UnifiedReward-qwen-32b

33B • Updated 4 days ago • 1.33k • 1

CodeGoat24/UnifiedReward-qwen-7b

8B • Updated 4 days ago • 2.78k • 5

Note: Built upon Qwen2.5-VL-Instruct; recommended.

CodeGoat24/UnifiedReward-7b-v1.5

8B • Updated 4 days ago • 1.77k • 6

CodeGoat24/UnifiedReward-qwen-3b

4B • Updated 4 days ago • 2.79k • 1

CodeGoat24/UnifiedReward-Think-qwen-7b

8B • Updated 4 days ago • 1.64k • 3

CodeGoat24/UnifiedReward-7b

8B • Updated 4 days ago • 460 • 6

CodeGoat24/UnifiedReward-Think-7b

8B • Updated 4 days ago • 476 • 10

CodeGoat24/UnifiedReward-0.5b

1B • Updated 4 days ago • 42 • 1

CodeGoat24/llava-onevision-qwen2-7b-ov-unifiedreward-dpo

8B • Updated 4 days ago • 6

CodeGoat24/sdxl-turbo-unified-reward-dpo

Text-to-Image • Updated 4 days ago • 10 • 1

CodeGoat24/LLaVA-Video-7B-Qwen2-UnifiedReward-DPO

8B • Updated 4 days ago • 47

CodeGoat24/T2V-Turbo

Updated 4 days ago