VQ-Logits: Compressing the Output Bottleneck of Large Language Models via Vector Quantized Logits Paper • 2505.10202 • Published May 15
Power-Law Decay Loss for Large Language Model Finetuning: A Theory Perspective Paper • 2505.16900 • Published May 22
ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention Paper • 2505.10222 • Published May 15
Towards Analyzing and Understanding the Limitations of VAPO: A Theoretical Perspective Paper • 2505.17997 • Published May 23
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems Paper • 2505.18943 • Published May 25
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction Paper • 2502.17239 • Published Feb 24
DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies Paper • 2503.14324 • Published Mar 18
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning Paper • 2503.19470 • Published Mar 25
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis Paper • 2503.22420 • Published Mar 28
Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning Paper • 2410.12952 • Published Oct 16, 2024