ASM-Pretrain Model Card

Model details

Model type: ASM is a unified vision-language foundation model for open-world panoptic visual recognition and understanding. By aligning with large language models (LLMs), it supports versatile image-text retrieval and generation tasks and demonstrates strong zero-shot capability.

Model date: ASM was trained in July 2023.

Paper or resources for more information: https://github.com/OpenGVLab/all-seeing

License

ASM is open-sourced under the Apache License 2.0.

Where to send questions or comments about the model: https://github.com/OpenGVLab/all-seeing/issues

Intended use

Primary intended uses: The primary use of ASM is research on large multimodal models and chatbots.

Primary intended users: The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.

Training dataset

The pretraining phase uses the AS-1B and Laion-COCO datasets.

Evaluation dataset

A collection of 6 benchmarks: 2 image captioning benchmarks, 2 region captioning benchmarks, and 2 region recognition benchmarks.
