Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks By nvidia and 4 others • 22 days ago • 68
Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation Paper • 2508.09987 • Published 20 days ago • 24
Article Introducing Command A Vision: Multimodal AI built for Business By CohereLabs and 3 others • Jul 31 • 63
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model Paper • 2507.01953 • Published Jul 2 • 19
BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing Paper • 2506.17450 • Published Jun 20 • 63
Article Gemma 3n fully available in the open-source ecosystem! By ariG23498 and 7 others • Jun 26 • 115
LettinGo: Explore User Profile Generation for Recommendation System Paper • 2506.18309 • Published Jun 23 • 11
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation Paper • 2506.10540 • Published Jun 12 • 38