Generate depth maps from images
Generate images from text prompts
Generate text descriptions from images