Submitted by razzant 192 Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders · 8 authors 2
Submitted by UglyToilet 61 SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models · 10 authors 1
Submitted by FanqingM 50 MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning · 14 authors 2
Submitted by tellarin 35 Taking Notes Brings Focus? Towards Multi-Turn Multimodal Dialogue Learning · 4 authors 2
Submitted by Seanie-lee 26 FedRand: Enhancing Privacy in Federated Learning with Randomized LoRA Subparameter Updates · 4 authors 1
Submitted by tianyic 25 DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs · 7 authors 2
Submitted by yiren98 20 EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer · 5 authors 2
Submitted by akhaliq 19 Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models · 8 authors 2
Submitted by CharonBony 17 FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation · 9 authors 6
Submitted by BoZhang 15 SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing · 7 authors 2
Submitted by akhaliq 14 AlphaDrive: Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning · 5 authors 1
Submitted by AQuarterMile 14 WritingBench: A Comprehensive Benchmark for Generative Writing · 11 authors 2
Submitted by RTT1 13 MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning · 12 authors 1
Submitted by giulio98 13 Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning · 4 authors 4
Submitted by gpx333 12 Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment · 7 authors 1
Submitted by lqniu 12 LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning · 5 authors 3
Submitted by TokerZ 11 Agent models: Internalizing Chain-of-Action Generation into Reasoning models · 5 authors 2
Submitted by zszhong 7 Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement · 7 authors 2
Submitted by Llwo 7 This Is Your Doge, If It Please You: Exploring Deception and Robustness in Mixture of LLMs · 3 authors 2
Submitted by ryanchen42 6 Words or Vision: Do Vision-Language Models Have Blind Faith in Text? · 4 authors 2
Submitted by xiaol 5 BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling · 2 authors 2
Submitted by wjkang 5 State-offset Tuning: State-based Parameter-Efficient Fine-Tuning for State Space Models · 6 authors 2
Submitted by SinclairSchneider 4 Detection Avoidance Techniques for Large Language Models · 4 authors 1
Submitted by msadat97 4 Efficient Distillation of Classifier-Free Guidance using Adapters · 2 authors 1
Submitted by BestWishYsh 4 WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation · 9 authors 1
Submitted by dxli1 3 ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks · 7 authors 2
Submitted by JeongHun0716 3 Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations · 5 authors 2
Submitted by ddgoodgood 2 TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models · 6 authors 1
Submitted by xwen99 2 A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning · 5 authors 2
Submitted by hisoka94 2 Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs · 3 authors 2
Submitted by LorenaYannnnn 2 Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries · 2 authors 4
Submitted by Amoik 1 REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding · 7 authors 1
Submitted by XThomasBU 1 What's in a Latent? Leveraging Diffusion Latent Space for Domain Generalization · 2 authors 2
Submitted by MingxingLi 1 Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model · 5 authors 2
Submitted by dinobby 1 Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning · 5 authors 2
Submitted by teelinsan 1 Escaping Plato's Cave: Towards the Alignment of 3D and Text Latent Spaces · 8 authors 2
Submitted by raaec 1 PhiloBERTA: A Transformer-Based Cross-Lingual Analysis of Greek and Latin Lexicons · 2 authors 2
Submitted by KianYale 1 NeuGrasp: Generalizable Neural Surface Reconstruction with Background Priors for Material-Agnostic Object Grasp Detection · 8 authors 2
Submitted by mskrt 1 Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts · 9 authors 2