EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control
Paper • 2508.21112 • Published • 60
EmbodiedOneVision is a unified framework for multimodal embodied reasoning and robot control, featuring interleaved vision-text-action pretraining.