Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2401.12168

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding

Paper • 2401.09340 • Published Jan 17, 2024 • 21
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

Paper • 2401.12168 • Published Jan 22, 2024 • 27
Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding

Paper • 2401.15708 • Published Jan 28, 2024 • 12

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

Paper • 2401.12168 • Published Jan 22, 2024 • 27

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

Paper • 2312.16862 • Published Dec 28, 2023 • 31
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

Paper • 2312.17172 • Published Dec 28, 2023 • 28
Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers

Paper • 2401.01974 • Published Jan 3, 2024 • 7
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Paper • 2401.01885 • Published Jan 3, 2024 • 28

any size diffusion

Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images

Paper • 2308.16582 • Published Aug 31, 2023 • 12
DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation

Paper • 2310.13119 • Published Oct 19, 2023 • 13
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

Paper • 2310.16818 • Published Oct 25, 2023 • 32
Text-to-3D with classifier score distillation

Paper • 2310.19415 • Published Oct 30, 2023 • 5

Kosmos-2.5: A Multimodal Literate Model

Paper • 2309.11419 • Published Sep 20, 2023 • 50
Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities

Paper • 2311.05698 • Published Nov 9, 2023 • 14
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Paper • 2311.06242 • Published Nov 10, 2023 • 91
PolyMaX: General Dense Prediction with Mask Transformer

Paper • 2311.05770 • Published Nov 9, 2023 • 11

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs