Daily Papers

by AK and the research community

Aug 2

Submitted by

akhaliq

SAM 2: Segment Anything in Images and Videos

·
18 authors

Submitted by

osanseviero

Gemma 2: Improving Open Language Models at a Practical Size

·
196 authors

Submitted by

akhaliq

SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement

·
4 authors

Submitted by

akhaliq

OmniParser for Pure Vision Based GUI Agent

·
4 authors

Submitted by

akhaliq

Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning

·
3 authors

Submitted by

akhaliq

Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model

·
7 authors

Submitted by

akhaliq

TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models

·
5 authors

Submitted by

giulio98

Finch: Prompt-guided Key-Value Cache Compression

·
2 authors

Submitted by

whyu

MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities

·
10 authors

Submitted by

manuelkansy

Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion

·
5 authors

Submitted by

akhaliq

UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model

·
5 authors

Submitted by

akhaliq

Tails Tell Tales: Chapter-Wide Manga Transcriptions with Character Names

·
3 authors

Submitted by

susunghong

Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention

·
1 author

Submitted by

gsarti

Non Verbis, Sed Rebus: Large Language Models are Weak Solvers of Italian Rebuses

·
4 authors

Submitted by

AtsuMiyai

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

·
13 authors

Submitted by

akhaliq

Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation

·
7 authors

Submitted by

Omartificial-Intelligence-Space

Enhancing Semantic Similarity Understanding in Arabic NLP with Nested Embedding Learning

·
2 authors