Generate text from an image and a question
Describe images and extract text with Florence-2
Identify named entities in text