xjtupanda/MiniCPM-V-30K-mix-finetune

@@ -1,32 +1,30 @@
 ---
 datasets:
 - MBZUAI/VideoInstruct-100K
 - Share14/ShareGemini
 - xjtupanda/T2Vid-Synthetic
-base_model:
-- openbmb/MiniCPM-Llama3-V-2_5
 pipeline_tag: video-text-to-text
 tags:
 - MiniCPM-V
 - finetune
 - MLLM
-license: apache-2.0
-library_name: transformers
 ---
 <h1>Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation</h1>
 <p>
         💻 <a href="https://github.com/VITA-MLLM/Sparrow">GitHub</a>&nbsp;&nbsp; | &nbsp;&nbsp; 📑 <a href="https://arxiv.org/pdf/2411.19951">Paper</a> &nbsp;&nbsp;
 </p>
 ## Model Summary
 * This is a part of the project [Sparrow](https://github.com/VITA-MLLM/Sparrow).
 * The video-LLM is fine-tuned from the image-LLM [MiniCPM-Llama3-V-2_5](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5).
 ## License
 #### Model License
@@ -35,13 +33,12 @@ library_name: transformers
 * The usage of MiniCPM-V series model weights must strictly follow [MiniCPM Model License.md](https://github.com/OpenBMB/MiniCPM/blob/main/MiniCPM%20Model%20License.md).
 * The models and weights of MiniCPM are completely free for academic research. After filling out a ["questionnaire"](https://modelbest.feishu.cn/share/base/form/shrcnpV5ZT9EJ6xYjh3Kx0J6v8g) for registration, they are also available for free commercial use.
 #### Statement
 * As an LLM, MiniCPM-Llama3-V 2.5 generates content by learning from a large amount of text, but it cannot comprehend, express personal opinions, or make value judgements. Anything generated by MiniCPM-Llama3-V 2.5 does not represent the views and positions of the model developers.
 * We will not be liable for any problems arising from the use of the MiniCPM-V open-source model, including but not limited to data security issues, risk of public opinion, or any risks and problems arising from the misdirection, misuse, or dissemination of the model.
 ## Training dataset
 - 10K video instruction data from Video-ChatGPT
 - 10K video caption data from ShareGemini
-- 10K synthetic data derived from long text instruction data

 ---
+base_model:
+- openbmb/MiniCPM-Llama3-V-2_5
 datasets:
 - MBZUAI/VideoInstruct-100K
 - Share14/ShareGemini
 - xjtupanda/T2Vid-Synthetic
+library_name: transformers
+license: apache-2.0
 pipeline_tag: video-text-to-text
 tags:
 - MiniCPM-V
 - finetune
 - MLLM
 ---
+```markdown
 <h1>Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation</h1>
 <p>
         💻 <a href="https://github.com/VITA-MLLM/Sparrow">GitHub</a>&nbsp;&nbsp; | &nbsp;&nbsp; 📑 <a href="https://arxiv.org/pdf/2411.19951">Paper</a> &nbsp;&nbsp;
 </p>
 ## Model Summary
 * This is a part of the project [Sparrow](https://github.com/VITA-MLLM/Sparrow).
 * The video-LLM is fine-tuned from the image-LLM [MiniCPM-Llama3-V-2_5](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5).
 ## License
 #### Model License
 * The usage of MiniCPM-V series model weights must strictly follow [MiniCPM Model License.md](https://github.com/OpenBMB/MiniCPM/blob/main/MiniCPM%20Model%20License.md).
 * The models and weights of MiniCPM are completely free for academic research. After filling out a ["questionnaire"](https://modelbest.feishu.cn/share/base/form/shrcnpV5ZT9EJ6xYjh3Kx0J6v8g) for registration, they are also available for free commercial use.
 #### Statement
 * As an LLM, MiniCPM-Llama3-V 2.5 generates content by learning from a large amount of text, but it cannot comprehend, express personal opinions, or make value judgements. Anything generated by MiniCPM-Llama3-V 2.5 does not represent the views and positions of the model developers.
 * We will not be liable for any problems arising from the use of the MiniCPM-V open-source model, including but not limited to data security issues, risk of public opinion, or any risks and problems arising from the misdirection, misuse, or dissemination of the model.
 ## Training dataset
 - 10K video instruction data from Video-ChatGPT
 - 10K video caption data from ShareGemini
+- 10K synthetic data derived from long text instruction data
+```
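
The metadata half of this change (moving `base_model`, `library_name`, and `license` within the front matter) can be sanity-checked with a few lines of Python. The following is only a minimal sketch, assuming the front matter uses nothing beyond top-level `key: value` pairs and `key:` followed by `- item` lines, as it does in this card. `parse_front_matter` and `NEW_CARD` are hypothetical names introduced for illustration; they are not part of any Hugging Face library, and a real check would use a full YAML parser.

```python
# Front matter as it appears on the "+" side of the diff (assumed complete).
NEW_CARD = """---
base_model:
- openbmb/MiniCPM-Llama3-V-2_5
datasets:
- MBZUAI/VideoInstruct-100K
- Share14/ShareGemini
- xjtupanda/T2Vid-Synthetic
library_name: transformers
license: apache-2.0
pipeline_tag: video-text-to-text
tags:
- MiniCPM-V
- finetune
- MLLM
---
"""


def parse_front_matter(card_text):
    """Parse a very small YAML subset: scalars and flat `- item` lists.

    Hypothetical helper for illustration only; not a general YAML parser.
    """
    lines = card_text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the card
    meta = {}
    current_list_key = None  # key whose list items are being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the block
            break
        if stripped.startswith("- ") and current_list_key is not None:
            meta[current_list_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value  # scalar entry
                current_list_key = None
            else:
                meta[key] = []  # list items follow on the next lines
                current_list_key = key
    return meta


meta = parse_front_matter(NEW_CARD)
# The three fields this diff touches should all be present and non-empty.
missing = [k for k in ("base_model", "library_name", "license") if not meta.get(k)]
```

If `missing` is empty, the reordered front matter still carries every field that the old version declared.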