A unified multimodal understanding and generation model
Next-generation reasoning model that runs locally in-browser
Scalable and versatile 3D generation from images
Visual Quality Control for DocVQA
A tiny vision-language model