Image to Video Generation
4M: Massively Multimodal Masked Modeling
High-fidelity Virtual Try-on
Identify objects in images using text queries