- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 54
- Improving Text Embeddings with Large Language Models
  Paper • 2401.00368 • Published • 80
- HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models
  Paper • 2403.13447 • Published • 18
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
  Paper • 2403.05530 • Published • 62
Collections
Collections including paper arxiv:2402.19479
- MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation
  Paper • 2401.04468 • Published • 49
- Anything in Any Scene: Photorealistic Video Object Insertion
  Paper • 2401.17509 • Published • 17
- Memory Consolidation Enables Long-Context Video Understanding
  Paper • 2402.05861 • Published • 9
- Magic-Me: Identity-Specific Video Customized Diffusion
  Paper • 2402.09368 • Published • 28

- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 42
- AToM: Amortized Text-to-Mesh using 2D Diffusion
  Paper • 2402.00867 • Published • 11
- Neural Network Diffusion
  Paper • 2402.13144 • Published • 95
- Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
  Paper • 2402.19479 • Published • 33