DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering Paper • 2507.11527 • Published Jul 15 • 31
MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding Paper • 2507.12463 • Published Jul 16 • 26
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs Paper • 2507.09477 • Published Jul 13 • 80
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? Paper • 2507.12415 • Published Jul 16 • 41
Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning Paper • 2507.14137 • Published Jul 18 • 34
A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models Paper • 2507.13563 • Published Jul 17 • 51
The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs Paper • 2507.11097 • Published Jul 15 • 63
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published Jul 19 • 128
Task-Specific Zero-shot Quantization-Aware Training for Object Detection Paper • 2507.16782 • Published Jul 22 • 9
Iwin Transformer: Hierarchical Vision Transformer using Interleaved Windows Paper • 2507.18405 • Published Jul 24 • 4
DriftMoE: A Mixture of Experts Approach to Handle Concept Drifts Paper • 2507.18464 • Published Jul 24 • 11
LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization Paper • 2507.15758 • Published Jul 21 • 34
Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning Paper • 2507.21049 • Published Jul 28 • 41