Ivan Tang

IvanTang

Ivan_Tang_3D

AI & ML interests

Multimodal, 3D, PEFT, LLM & MLLM

Recent Activity

authored a paper 28 days ago

Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

liked a model 29 days ago

IvanTang/3DGen-R1

upvoted a paper 29 days ago

Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

Organizations

None yet

upvoted a paper 29 days ago

Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

Paper • 2512.10949 • Published 29 days ago • 45

upvoted a paper 3 months ago

Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction

Paper • 2510.04759 • Published Oct 6, 2025 • 9

upvoted a paper 8 months ago

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

Paper • 2505.12650 • Published May 19, 2025 • 8

upvoted 3 papers 11 months ago

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Paper • 2501.13926 • Published Jan 23, 2025 • 43

SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model

Paper • 2501.15830 • Published Jan 27, 2025 • 13

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Paper • 2502.09620 • Published Feb 13, 2025 • 26

upvoted a paper about 1 year ago

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

Paper • 2409.15278 • Published Sep 23, 2024 • 24

upvoted 2 papers over 1 year ago

SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners

Paper • 2408.16768 • Published Aug 29, 2024 • 28

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21, 2024 • 53

upvoted a paper about 2 years ago

SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models

Paper • 2311.07575 • Published Nov 13, 2023 • 15