Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models · 8 authors · Submitted by akhaliq
ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion · 6 authors · Submitted by akhaliq
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text · 11 authors · Submitted by akhaliq
Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction · 7 authors · Submitted by akhaliq
EgoLifter: Open-world 3D Segmentation for Egocentric Perception · 6 authors · Submitted by akhaliq
FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing · 4 authors · Submitted by akhaliq
Towards a World-English Language Model for On-Device Virtual Assistants · 6 authors · Submitted by akhaliq