The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding • Paper • 2502.08946 • Published 26 days ago • 182
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models • Paper • 2406.06563 • Published Jun 3, 2024 • 20
Improve Mathematical Reasoning in Language Models by Automated Process Supervision • Paper • 2406.06592 • Published Jun 5, 2024 • 29
Simplified and Generalized Masked Diffusion for Discrete Data • Paper • 2406.04329 • Published Jun 6, 2024 • 7
Hibou: A Family of Foundational Vision Transformers for Pathology • Paper • 2406.05074 • Published Jun 7, 2024 • 9
Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models • Paper • 2406.04320 • Published Jun 6, 2024 • 10
Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models • Paper • 2406.08487 • Published Jun 12, 2024 • 14
Large Language Model Unlearning via Embedding-Corrupted Prompts • Paper • 2406.07933 • Published Jun 12, 2024 • 10
Hierarchical Patch Diffusion Models for High-Resolution Video Generation • Paper • 2406.07792 • Published Jun 12, 2024 • 16
AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation • Paper • 2406.07686 • Published Jun 11, 2024 • 17
Discovering Preference Optimization Algorithms with and for Large Language Models • Paper • 2406.08414 • Published Jun 12, 2024 • 17
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos • Paper • 2406.08407 • Published Jun 12, 2024 • 28
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters • Paper • 2406.05955 • Published Jun 10, 2024 • 27
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion • Paper • 2406.04338 • Published Jun 6, 2024 • 38