Submitted by osanseviero 76 Gemma 2: Improving Open Language Models at a Practical Size · 196 authors 3
Submitted by akhaliq 29 SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement · 4 authors 2
Submitted by akhaliq 24 Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning · 3 authors 6
Submitted by akhaliq 22 Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model · 7 authors 2
Submitted by akhaliq 16 TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models · 5 authors 2
Submitted by whyu 13 MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities · 10 authors 9
Submitted by manuelkansy 11 Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion · 5 authors 2
Submitted by akhaliq 10 UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model · 5 authors 2
Submitted by akhaliq 10 Tails Tell Tales: Chapter-Wide Manga Transcriptions with Character Names · 3 authors 2
Submitted by susunghong 6 Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention · 1 author 2
Submitted by gsarti 6 Non Verbis, Sed Rebus: Large Language Models are Weak Solvers of Italian Rebuses · 4 authors 2
Submitted by AtsuMiyai 5 Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey · 13 authors 2
Submitted by akhaliq 5 Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation · 7 authors 2
Submitted by Omartificial-Intelligence-Space 3 Enhancing Semantic Similarity Understanding in Arabic NLP with Nested Embedding Learning · 2 authors 2