interesting
Describe Anything: Detailed Localized Image and Video Captioning
Paper
•
2504.16072
•
Published
•
63
EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment
Paper
•
2410.09604
•
Published
Geospatial Mechanistic Interpretability of Large Language Models
Paper
•
2505.03368
•
Published
•
11
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation
Paper
•
2505.02836
•
Published
•
8
Constructing a 3D Town from a Single Image
Paper
•
2505.15765
•
Published
•
24
SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding
Paper
•
2505.17012
•
Published
•
12
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models
Paper
•
2505.17015
•
Published
•
9
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
Paper
•
2506.00123
•
Published
•
35
Point-MoE: Towards Cross-Domain Generalization in 3D Semantic Segmentation via Mixture-of-Experts
Paper
•
2505.23926
•
Published
•
5
TaskCraft: Automated Generation of Agentic Tasks
Paper
•
2506.10055
•
Published
•
32
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
Paper
•
2506.16406
•
Published
•
130
RLPR: Extrapolating RLVR to General Domains without Verifiers
Paper
•
2506.18254
•
Published
•
31
Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs
Paper
•
2506.21656
•
Published
•
15
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Paper
•
2507.00432
•
Published
•
79
Reconstructing 4D Spatial Intelligence: A Survey
Paper
•
2507.21045
•
Published
•
35
Exploitation Is All You Need... for Exploration
Paper
•
2508.01287
•
Published
•
6
3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
Paper
•
2507.23478
•
Published
•
15
MolmoAct: Action Reasoning Models that can Reason in Space
Paper
•
2508.07917
•
Published
•
44
Has GPT-5 Achieved Spatial Intelligence? An Empirical Study
Paper
•
2508.13142
•
Published
•
34
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
Paper
•
2508.14879
•
Published
•
68
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
Paper
•
2508.19247
•
Published
•
43
Spacer: Towards Engineered Scientific Inspiration
Paper
•
2508.17661
•
Published
•
32
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper
•
2509.02547
•
Published
•
228
Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vector Drawings
Paper
•
2508.18733
•
Published
•
9
Bootstrapping Task Spaces for Self-Improvement
Paper
•
2509.04575
•
Published
•
5
3D Aware Region Prompted Vision Language Model
Paper
•
2509.13317
•
Published
•
14
CAD-Tokenizer: Towards Text-based CAD Prototyping via Modality-Specific Tokenization
Paper
•
2509.21150
•
Published
•
3
Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls
Paper
•
2510.00184
•
Published
•
16
Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning
Paper
•
2510.03259
•
Published
•
57
Watch and Learn: Learning to Use Computers from Online Videos
Paper
•
2510.04673
•
Published
•
11
OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification
Paper
•
2512.10756
•
Published
•
33
Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning
Paper
•
2512.15687
•
Published
•
17
When Reasoning Meets Its Laws
Paper
•
2512.17901
•
Published
•
54
Paper
•
2512.16301
•
Published
•
98