This is the official pre-trained model for the paper "VIRT: Vision Instructed Robotic Transformer for Manipulation Learning". The model is pre-trained on the DROID dataset using the robotic imagery pre-training technique. If you find this model useful, please cite:

@article{li2024virt,
  title={VIRT: Vision Instructed Robotic Transformer for Manipulation Learning},
  author={Li, Zhuoling and Ren, Liangliang and Yang, Jinrong and Zhao, Yong and others},
  journal={arXiv preprint arXiv:2410.07169},
  year={2024}
}