Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization • Paper • 2411.10442 • Published Nov 15, 2024 • 77
Part123: Part-aware 3D Reconstruction from a Single-view Image • Paper • 2405.16888 • Published May 27, 2024 • 12
STT: Stateful Tracking with Transformers for Autonomous Driving • Paper • 2405.00236 • Published Apr 30, 2024 • 9
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings • Paper • 2404.16820 • Published Apr 25, 2024 • 17
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding • Paper • 2404.16710 • Published Apr 25, 2024 • 79
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training • Paper • 2403.09611 • Published Mar 14, 2024 • 127
HyperFields: Towards Zero-Shot Generation of NeRFs from Text • Paper • 2310.17075 • Published Oct 26, 2023 • 15
3D-GPT: Procedural 3D Modeling with Large Language Models • Paper • 2310.12945 • Published Oct 19, 2023 • 59
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection • Paper • 2310.11511 • Published Oct 17, 2023 • 78