Image-Text-to-Text
Transformers
Safetensors
multilingual
internvl
custom_code
conversational

Add more descriptive tags to InternVL3.5-1B model card

#1
by nielsr HF Staff - opened

This pull request aims to improve the discoverability and completeness of the InternVL3.5-1B model card. Based on the paper abstract and the model's described capabilities, I've added the following relevant tags to the metadata:

  • multimodal: The model is explicitly described as an "open-source multimodal model."
  • reasoning: The paper highlights "reasoning capability" as a significant advancement.
  • agent: The model supports "GUI interaction and embodied agency," aligning with agentic tasks.
  • llm: The model is an "open-source MLLM" (multimodal large language model).
  • efficiency: The paper emphasizes "inference efficiency" as a key improvement.

All other existing information, including paper link, GitHub link, project page, and sample usage, is already well-documented and accurate.
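
As an illustration of the metadata change described above, here is a minimal sketch of appending the five proposed tags to a card's tag list without creating duplicates. The helper names `merge_tags` and `render_tags_block` are hypothetical, and the starting tag list is an assumption made for the example, not the card's confirmed metadata.

```python
def merge_tags(existing, added):
    """Return the existing tags followed by any new tags, skipping duplicates.

    Comparison is case-insensitive, so "LLM" is not added if "llm" is present.
    """
    merged = list(existing)
    seen = {tag.lower() for tag in existing}
    for tag in added:
        if tag.lower() not in seen:
            merged.append(tag)
            seen.add(tag.lower())
    return merged


def render_tags_block(tags):
    """Render a YAML-style `tags:` list as it might appear in card front matter."""
    return "tags:\n" + "\n".join(f"- {tag}" for tag in tags)


# Assumption: the card already lists these tags before the change.
existing_tags = ["internvl", "custom_code"]
# The five tags proposed in this pull request.
new_tags = ["multimodal", "reasoning", "agent", "llm", "efficiency"]

merged_tags = merge_tags(existing_tags, new_tags)
print(render_tags_block(merged_tags))
```

In practice the same list could probably be submitted with `huggingface_hub.metadata_update(..., create_pr=True)` rather than edited by hand. The sketch only shows the merge step.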
