Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain • 1 day ago • 19
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 27 items • Updated about 16 hours ago • 112
Qwen2.5 Collection The Qwen 2.5 models are a series of AI models trained on 18 trillion tokens, supporting 29 languages and offering advanced features such as instructio • 33 items • Updated Oct 12, 2024 • 7
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 4 days ago • 287
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published Nov 15, 2024 • 73
InternVL2.5-MPO Collection Enhancing the Reasoning Ability of MLLMs via Mixed Preference Optimization • 16 items • Updated 1 day ago • 26
Article Yay! Organizations can now publish blog Articles By huggingface • 10 days ago • 30
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference 15 days ago • 61
Article MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era By MiniMax-AI • 16 days ago • 40
Article Announcing NVIDIA Cosmos World Foundation Models By mingyuliutw • 24 days ago • 23