Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google 5 days ago • 54
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 28 days ago • 362
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 28 days ago • 100
ViTPose Collection Collection of ViTPose models based on the transformers implementation. • 10 items • Updated Jan 12 • 12
KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model Paper • 2501.01028 • Published Jan 2 • 13
Agents Collection Collection of resources related to Agents. • 73 items • Updated 27 days ago • 6
Eagle 2 Collection Eagle 2 is a family of frontier vision-language models with vision-centric design. The model supports 4K HD input, long-context video, and grounding. • 9 items • Updated Jan 23 • 31
Llama 3.3 Collection This collection hosts the transformers and original repos of Llama 3.3 • 1 item • Updated Dec 6, 2024 • 135
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated Dec 13, 2024 • 142
SmolVLM Collection State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct • 5 items • Updated 4 days ago • 34
Arabic Matryoshka Embedding Models Collection A collection of advanced Arabic Matryoshka Embedding Models designed for efficient and high-performance Arabic NLP, available publicly on Hugging Face • 11 items • Updated 12 days ago • 12
GATE: General Arabic Text Embedding Models Collection This collection includes the GATE models, a new series of sentence transformer models trained on multi-task datasets using different losses. • 2 items • Updated Aug 29, 2024 • 1
AIMv2 Collection A collection of AIMv2 vision encoders that support a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 73