Reward Models Collection • Nemotron reward models. For use in RLHF pipelines and LLM-as-a-Judge • 8 items • Updated 3 days ago • 21
ds4sd/SmolDocling-256M-preview • Image-Text-to-Text • 0.3B • Updated 8 days ago • 35.3k • 1.56k
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated May 1 • 344k • 1.48k
Qwen2.5-VL Collection • Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 535