Update README.md
README.md CHANGED
@@ -11,15 +11,12 @@ This repository contains the PyTorch version of the InternVL model weights.
 
 ## What is InternVL?
 
-\[[Paper](https://arxiv.org/abs/2312.14238)\] \[[GitHub](https://github.com/OpenGVLab/InternVL)\]
+\[[Paper](https://arxiv.org/abs/2312.14238)\] \[[GitHub](https://github.com/OpenGVLab/InternVL)\] \[[Chat Demo](https://internvl.opengvlab.com/)\]
 
 InternVL scales up the ViT to _**6B parameters**_ and aligns it with LLM.
 
-It is trained using web-scale, noisy image-text pairs. The data are all publicly available and comprise multilingual content, including LAION-en, LAION-multi, LAION-COCO, COYO, Wukong, CC12M, CC3M, and SBU.
-
 It is _**the largest open-source vision/vision-language foundation model (14B)**_ to date, achieving _**32 state-of-the-art**_ performances on a wide range of tasks such as visual perception, cross-modal retrieval, multimodal dialogue, etc.
 
-
 [image]
 
 
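Since the repository described by this README hosts the PyTorch weights, a minimal loading sketch may help. This is an assumption-based example, not part of the commit: the repo id shown is illustrative, and the `trust_remote_code=True` flag and dtype are guesses based on common Hugging Face conventions for custom architectures.

```python
import torch
from transformers import AutoModel

# Hypothetical repo id -- substitute the actual InternVL weights repository.
MODEL_ID = "OpenGVLab/InternVL-14B-224px"

# InternVL ships custom modeling code, so trust_remote_code=True is assumed
# to be required, as with most non-standard architectures on the Hub.
model = AutoModel.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # assumed dtype; fp16/fp32 should also work
    trust_remote_code=True,
).eval()
```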