Submitted by akhaliq 59 Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling · 3 authors 2
Submitted by akhaliq 41 LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing · 5 authors 10
Submitted by akhaliq 26 Controllable Music Production with Diffusion Models and Guidance Gradients · 5 authors 1
Submitted by akhaliq 20 The Generative AI Paradox: "What It Can Create, It May Not Understand" · 14 authors 5
Submitted by akhaliq 11 ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation · 4 authors 1
Submitted by akhaliq 10 AMSP: Super-Scaling LLM Training via Advanced Model States Partitioning · 7 authors 1
Submitted by akhaliq 10 Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans? · 5 authors 1