Add Github link to model card

#1
by nielsr HF staff - opened
Files changed (1)
  1. README.md +7 -10
README.md CHANGED
@@ -1,32 +1,30 @@
1
  ---
 
 
2
  datasets:
3
  - MBZUAI/VideoInstruct-100K
4
  - Share14/ShareGemini
5
  - xjtupanda/T2Vid-Synthetic
6
- base_model:
7
- - openbmb/MiniCPM-Llama3-V-2_5
8
  pipeline_tag: video-text-to-text
9
  tags:
10
  - MiniCPM-V
11
  - finetune
12
  - MLLM
13
- license: apache-2.0
14
- library_name: transformers
15
  ---
16
 
17
-
18
  <h1>Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation</h1>
19
  <p>
20
  💻 <a href="https://github.com/VITA-MLLM/Sparrow">GitHub</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://arxiv.org/pdf/2411.19951">Paper</a> &nbsp&nbsp </a>
21
  </p>
22
 
23
-
24
  ## Model Summary
25
 
26
  * This is a part of the project [Sparrow](https://github.com/VITA-MLLM/Sparrow).
27
  * The video-LLM is fine-tuned from the image-LLM [MiniCPM-Llama3-V-2_5](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5).
28
 
29
-
30
  ## License
31
 
32
  #### Model License
@@ -35,13 +33,12 @@ library_name: transformers
35
  * The usage of MiniCPM-V series model weights must strictly follow [MiniCPM Model License.md](https://github.com/OpenBMB/MiniCPM/blob/main/MiniCPM%20Model%20License.md).
36
  * The models and weights of MiniCPM are completely free for academic research. After filling out a ["questionnaire"](https://modelbest.feishu.cn/share/base/form/shrcnpV5ZT9EJ6xYjh3Kx0J6v8g) for registration, they are also available for free commercial use.
37
 
38
-
39
  #### Statement
40
  * As an LLM, MiniCPM-Llama3-V 2.5 generates content by learning from a large amount of text, but it cannot comprehend, express personal opinions, or make value judgements. Anything generated by MiniCPM-Llama3-V 2.5 does not represent the views and positions of the model developers.
41
  * We will not be liable for any problems arising from the use of the MiniCPM-V open-source model, including but not limited to data security issues, risks of public opinion, or any risks and problems arising from the misdirection, misuse, or dissemination of the model.
42
 
43
-
44
  ## Training dataset
45
  - 10K video instruction data from Video-ChatGPT
46
  - 10K video caption data from ShareGemini
47
- - 10K synthetic data derived from long text instruction data
 
 
1
  ---
2
+ base_model:
3
+ - openbmb/MiniCPM-Llama3-V-2_5
4
  datasets:
5
  - MBZUAI/VideoInstruct-100K
6
  - Share14/ShareGemini
7
  - xjtupanda/T2Vid-Synthetic
8
+ library_name: transformers
9
+ license: apache-2.0
10
  pipeline_tag: video-text-to-text
11
  tags:
12
  - MiniCPM-V
13
  - finetune
14
  - MLLM
 
 
15
  ---
16
 
17
+ ```markdown
18
  <h1>Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation</h1>
19
  <p>
20
  💻 <a href="https://github.com/VITA-MLLM/Sparrow">GitHub</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://arxiv.org/pdf/2411.19951">Paper</a> &nbsp&nbsp </a>
21
  </p>
22
 
 
23
  ## Model Summary
24
 
25
  * This is a part of the project [Sparrow](https://github.com/VITA-MLLM/Sparrow).
26
  * The video-LLM is fine-tuned from the image-LLM [MiniCPM-Llama3-V-2_5](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5).
27
 
 
28
  ## License
29
 
30
  #### Model License
 
33
  * The usage of MiniCPM-V series model weights must strictly follow [MiniCPM Model License.md](https://github.com/OpenBMB/MiniCPM/blob/main/MiniCPM%20Model%20License.md).
34
  * The models and weights of MiniCPM are completely free for academic research. After filling out a ["questionnaire"](https://modelbest.feishu.cn/share/base/form/shrcnpV5ZT9EJ6xYjh3Kx0J6v8g) for registration, they are also available for free commercial use.
35
 
 
36
  #### Statement
37
  * As an LLM, MiniCPM-Llama3-V 2.5 generates content by learning from a large amount of text, but it cannot comprehend, express personal opinions, or make value judgements. Anything generated by MiniCPM-Llama3-V 2.5 does not represent the views and positions of the model developers.
38
  * We will not be liable for any problems arising from the use of the MiniCPM-V open-source model, including but not limited to data security issues, risks of public opinion, or any risks and problems arising from the misdirection, misuse, or dissemination of the model.
39
 
 
40
  ## Training dataset
41
  - 10K video instruction data from Video-ChatGPT
42
  - 10K video caption data from ShareGemini
43
+ - 10K synthetic data derived from long text instruction data
44
+ ```
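
For reviewers: the front-matter hunk above moves `base_model`, `library_name`, and `license` but does not appear to change any value. A minimal sketch for checking that is below. It assumes the two YAML blocks are copied exactly as shown in the diff, and `parse_front_matter` is a hypothetical helper written for this note that handles only the flat `key: value` / `- item` subset used here; it is not part of `transformers` or `huggingface_hub`.

```python
def parse_front_matter(text):
    """Parse a flat YAML subset: top-level `key: value` pairs and `- item` lists.

    Hypothetical helper for illustration only; a real card should be read
    with a full YAML parser.
    """
    meta = {}
    current_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "---":
            continue  # skip blank lines and the front-matter delimiters
        if line.startswith("- "):
            # A list item belongs to the most recently seen key.
            meta[current_key].append(line[2:].strip())
        else:
            key, _, value = line.partition(":")
            current_key = key.strip()
            value = value.strip()
            # An empty value means a list follows on the next lines.
            meta[current_key] = value if value else []
    return meta


# Front matter before the change, copied from the left side of the diff.
OLD = """---
datasets:
- MBZUAI/VideoInstruct-100K
- Share14/ShareGemini
- xjtupanda/T2Vid-Synthetic
base_model:
- openbmb/MiniCPM-Llama3-V-2_5
pipeline_tag: video-text-to-text
tags:
- MiniCPM-V
- finetune
- MLLM
license: apache-2.0
library_name: transformers
---"""

# Front matter after the change, copied from the right side of the diff.
NEW = """---
base_model:
- openbmb/MiniCPM-Llama3-V-2_5
datasets:
- MBZUAI/VideoInstruct-100K
- Share14/ShareGemini
- xjtupanda/T2Vid-Synthetic
library_name: transformers
license: apache-2.0
pipeline_tag: video-text-to-text
tags:
- MiniCPM-V
- finetune
- MLLM
---"""

old_meta = parse_front_matter(OLD)
new_meta = parse_front_matter(NEW)

# Dict equality ignores key order, so this checks values only.
same_metadata = old_meta == new_meta
print("metadata unchanged:", same_metadata)
print("new key order:", list(new_meta))
```

If the comparison holds, the metadata part of this change is a reordering of keys into alphabetical order, and the remaining differences are in the card body.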