Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2409.11340

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 26
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 43
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 22

Prompt Expansion

IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation

Paper • 2409.08240 • Published Sep 12, 2024 • 22
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Paper • 2410.07171 • Published Oct 9, 2024 • 43
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models

Paper • 2410.07133 • Published Oct 9, 2024 • 19
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 113

Papers - Image - CoT

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 113
LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Paper • 2411.10440 • Published Nov 15, 2024 • 120

Running on Zero

180

180

FLUX.1 Dev Inpainting Model Beta GPU

🏆

Replace parts of an image using text prompts
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 113

📑Trending Papers - September 9⃣️

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18, 2024 • 145
Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5, 2024 • 89
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Paper • 2409.02634 • Published Sep 4, 2024 • 95
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 113

Omni-Generation

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 113
Video-Guided Foley Sound Generation with Multimodal Controls

Paper • 2411.17698 • Published Nov 26, 2024 • 9
FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait

Paper • 2412.01064 • Published Dec 2, 2024 • 27
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows

Paper • 2412.01169 • Published Dec 2, 2024 • 13

Interesting papers

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 113
NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11402 • Published Sep 17, 2024 • 74

Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 26
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 113

Diffusion-Papers

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 113

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 113

Previous
1
2
3
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs