Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated 3 days ago • 433
Sa2VA Model Zoo Collection Huggingace Model Zoo For Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos By Bytedance Seed CV Research • 4 items • Updated Feb 9 • 34
Qwen2.5-VL (All Versions) Collection All versions of Qwen2.5-VL including the new 32B version and 4-bit, 16-bit and more! • 12 items • Updated 3 days ago • 12
Stranger Zone Collections [ Org ] Collection Artificial general intelligence • 57 items • Updated Jan 5 • 7
Wavelets Are All You Need for Autoregressive Image Generation Paper • 2406.19997 • Published Jun 28, 2024 • 31
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation Paper • 2403.04692 • Published Mar 7, 2024 • 40
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation Paper • 2310.16656 • Published Oct 25, 2023 • 44