- AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
  Paper • 2402.12226 • Published • 43
- M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition
  Paper • 2401.11649 • Published • 3
- Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition
  Paper • 2402.15504 • Published • 22
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
  Paper • 2402.17485 • Published • 191

Collections
Collections including paper arxiv:2402.17485

- metavoiceio/metavoice-1B-v0.1
  Text-to-Speech • Updated • 1.31k • 785
- BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
  Paper • 2402.08093 • Published • 60
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
  Paper • 2402.17485 • Published • 191
- SWivid/F5-TTS
  Text-to-Speech • Updated • 936k • 942

- Advances in 3D Generation: A Survey
  Paper • 2401.17807 • Published • 19
- IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
  Paper • 2402.08682 • Published • 14
- LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
  Paper • 2402.05054 • Published • 26
- GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting
  Paper • 2402.10259 • Published • 16

- Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
  Paper • 2401.15977 • Published • 38
- Lumiere: A Space-Time Diffusion Model for Video Generation
  Paper • 2401.12945 • Published • 85
- AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
  Paper • 2307.04725 • Published • 64
- Boximator: Generating Rich and Controllable Motions for Video Synthesis
  Paper • 2402.01566 • Published • 27

- Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance
  Paper • 2401.15687 • Published • 23
- Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians
  Paper • 2312.03029 • Published • 26
- DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation
  Paper • 2312.13578 • Published • 29
- Splatter Image: Ultra-Fast Single-View 3D Reconstruction
  Paper • 2312.13150 • Published • 16

- Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
  Paper • 2401.09048 • Published • 10
- Improving fine-grained understanding in image-text pre-training
  Paper • 2401.09865 • Published • 17
- Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
  Paper • 2401.10891 • Published • 60
- Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
  Paper • 2401.13627 • Published • 74

- TRIPS: Trilinear Point Splatting for Real-Time Radiance Field Rendering
  Paper • 2401.06003 • Published • 25
- Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
  Paper • 2401.10891 • Published • 60
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
  Paper • 2402.17485 • Published • 191

- MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation
  Paper • 2401.04468 • Published • 49
- Anything in Any Scene: Photorealistic Video Object Insertion
  Paper • 2401.17509 • Published • 17
- Memory Consolidation Enables Long-Context Video Understanding
  Paper • 2402.05861 • Published • 10
- Magic-Me: Identity-Specific Video Customized Diffusion
  Paper • 2402.09368 • Published • 29