SESAME_minus

  • Model type: SESAME_minus is an open-source multimodal model trained by fine-tuning LLaVA on instruction-based image grounding (segmentation) data. It is an instruction-based segmentation model that serves as a baseline.
  • Paper or resources for more information: https://see-say-segment.github.io/
  • Where to send questions or comments about the model: https://github.com/see-say-segment/sesame/issues
  • Intended use
    • Primary intended uses: The primary use of SESAME_minus is research on large multimodal models and chatbots.
    • Primary intended users: The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.
  • Training dataset: RefCOCO(+/g)
