Generate detailed image captions and more
Using VLMs for video captioning
SAM2 image and video inference on ZeroGPU
Inference for the moondream2 point API
Detect objects in images and videos
State-of-the-art real-time object detection model
Identify objects in an image based on text prompts