Collections
Collections including paper arxiv:2309.11499
- DreamLLM: Synergistic Multimodal Comprehension and Creation
  Paper • 2309.11499 • Published • 58
- An Introduction to Vision-Language Modeling
  Paper • 2405.17247 • Published • 88
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 131
- No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding
  Paper • 2405.08344 • Published • 15

- DreamLLM: Synergistic Multimodal Comprehension and Creation
  Paper • 2309.11499 • Published • 58
- FoleyGen: Visually-Guided Audio Generation
  Paper • 2309.10537 • Published • 9
- Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
  Paper • 2310.11441 • Published • 28
- The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
  Paper • 2311.10093 • Published • 58

- Compositional Foundation Models for Hierarchical Planning
  Paper • 2309.08587 • Published • 11
- DreamLLM: Synergistic Multimodal Comprehension and Creation
  Paper • 2309.11499 • Published • 58
- VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning
  Paper • 2309.15091 • Published • 33
- Context-Aware Meta-Learning
  Paper • 2310.10971 • Published • 17

- OmnimatteRF: Robust Omnimatte with 3D Background Modeling
  Paper • 2309.07749 • Published • 8
- AudioSR: Versatile Audio Super-resolution at Scale
  Paper • 2309.07314 • Published • 27
- Generative Image Dynamics
  Paper • 2309.07906 • Published • 53
- MagiCapture: High-Resolution Multi-Concept Portrait Customization
  Paper • 2309.06895 • Published • 27

- PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models
  Paper • 2309.05793 • Published • 50
- InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation
  Paper • 2309.06380 • Published • 32
- ImageBind-LLM: Multi-modality Instruction Tuning
  Paper • 2309.03905 • Published • 17
- DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models
  Paper • 2309.06933 • Published • 13

- MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
  Paper • 2309.04662 • Published • 23
- Neurons in Large Language Models: Dead, N-gram, Positional
  Paper • 2309.04827 • Published • 17
- Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
  Paper • 2309.05516 • Published • 10
- DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
  Paper • 2309.03907 • Published • 12