Update: README
README.md (CHANGED)

@@ -16,7 +16,7 @@ tags:
<h1>A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone</h1>

-[GitHub](https://github.com/OpenBMB/MiniCPM-o) | [Demo](http://101.126.42.235:30910/)
+[GitHub](https://github.com/OpenBMB/MiniCPM-o) | [CookBook](https://github.com/OpenSQZ/MiniCPM-V-CookBook) | [Demo](http://101.126.42.235:30910/)

@@ -135,7 +135,7 @@ MiniCPM-V 4.5 can be easily used in various ways: (1) [llama.cpp](https://github
</table>
</div>

-Both Video-MME and OpenCompass were evaluated using 8×A100 GPUs for inference. The reported inference time of Video-MME excludes the cost of video frame extraction.
+Both Video-MME and OpenCompass were evaluated using 8×A100 GPUs for inference. The reported inference time of Video-MME covers all model-side computation and excludes the external cost of video frame extraction (which depends on the specific frame extraction tool) for fair comparison.
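
To make the measurement split described above concrete, here is a minimal sketch of timing frame extraction separately from model-side inference. It is not the benchmark harness: decord as the decoder, the video file name, the frame count, and the prompt are all illustrative assumptions; `model.chat` usage follows the Usage section below.

```python
# Hedged sketch: time video frame extraction separately from model-side inference.
# Assumes decord for decoding; file name, frame count, and prompt are placeholders.
import time
import torch
from decord import VideoReader, cpu
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained('openbmb/MiniCPM-V-4_5', trust_remote_code=True,
    attn_implementation='sdpa', torch_dtype=torch.bfloat16).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-V-4_5', trust_remote_code=True)

def extract_frames(video_path, num_frames=64):
    # Uniformly sample num_frames frames and return them as PIL images.
    vr = VideoReader(video_path, ctx=cpu(0))
    idx = [round(i * (len(vr) - 1) / (num_frames - 1)) for i in range(num_frames)]
    return [Image.fromarray(vr[i].asnumpy()) for i in idx]

t0 = time.time()
frames = extract_frames('video_test.mp4')            # excluded from the reported time
t1 = time.time()
msgs = [{'role': 'user', 'content': frames + ['Describe this video.']}]
answer = model.chat(msgs=msgs, tokenizer=tokenizer)   # model-side computation only
t2 = time.time()
print(f"frame extraction: {t1 - t0:.1f}s, model inference: {t2 - t1:.1f}s")
```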
### Examples

@@ -161,6 +161,91 @@ We deploy MiniCPM-V 4.5 on iPad M4 with [iOS demo](https://github.com/tc-mb/Mini
<img src="https://raw.githubusercontent.com/openbmb/MiniCPM-o/main/assets/minicpmv4_5/v45_cn_travel.gif" width="45%" style="display: inline-block; margin: 0 10px;"/>
</div>

## Framework Support Matrix
<table>
  <thead>
    <tr>
      <th>Category</th>
      <th>Framework</th>
      <th>Cookbook Link</th>
      <th>Upstream PR</th>
      <th>Supported since (branch)</th>
      <th>Supported since (release)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td rowspan="2">Edge (On-device)</td>
      <td>Llama.cpp</td>
      <td><a href="https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/deployment/llama.cpp/minicpm-v4_5_llamacpp.md">Llama.cpp Doc</a></td>
      <td><a href="https://github.com/ggml-org/llama.cpp/pull/15575">#15575</a> (2025-08-26)</td>
      <td>master (2025-08-26)</td>
      <td><a href="https://github.com/ggml-org/llama.cpp/releases/tag/b6282">b6282</a></td>
    </tr>
    <tr>
      <td>Ollama</td>
      <td><a href="https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/deployment/ollama/minicpm-v4_5_ollama.md">Ollama Doc</a></td>
      <td><a href="https://github.com/ollama/ollama/pull/12078">#12078</a> (2025-08-26)</td>
      <td>Merging</td>
      <td>Waiting for official release</td>
    </tr>
    <tr>
      <td rowspan="2">Serving (Cloud)</td>
      <td>vLLM</td>
      <td><a href="https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/deployment/vllm/minicpm-v4_5_vllm.md">vLLM Doc</a></td>
      <td><a href="https://github.com/vllm-project/vllm/pull/23586">#23586</a> (2025-08-26)</td>
      <td>main (2025-08-27)</td>
      <td>Waiting for official release</td>
    </tr>
    <tr>
      <td>SGLang</td>
      <td><a href="https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/deployment/sglang/MiniCPM-v4_5_sglang.md">SGLang Doc</a></td>
      <td><a href="https://github.com/sgl-project/sglang/pull/9610">#9610</a> (2025-08-26)</td>
      <td>Merging</td>
      <td>Waiting for official release</td>
    </tr>
    <tr>
      <td>Finetuning</td>
      <td>LLaMA-Factory</td>
      <td><a href="https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/finetune/finetune_llamafactory.md">LLaMA-Factory Doc</a></td>
      <td><a href="https://github.com/hiyouga/LLaMA-Factory/pull/9022">#9022</a> (2025-08-26)</td>
      <td>main (2025-08-26)</td>
      <td>Waiting for official release</td>
    </tr>
    <tr>
      <td rowspan="3">Quantization</td>
      <td>GGUF</td>
      <td><a href="https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/quantization/gguf/minicpm-v4_5_gguf_quantize.md">GGUF Doc</a></td>
      <td>—</td>
      <td>—</td>
      <td>—</td>
    </tr>
    <tr>
      <td>BNB</td>
      <td><a href="https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/quantization/bnb/minicpm-v4_5_bnb_quantize.md">BNB Doc</a></td>
      <td>—</td>
      <td>—</td>
      <td>—</td>
    </tr>
    <tr>
      <td>AWQ</td>
      <td><a href="https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/quantization/awq/minicpm-v4_5_awq_quantize.md">AWQ Doc</a></td>
      <td>—</td>
      <td>—</td>
      <td>—</td>
    </tr>
    <tr>
      <td>Demos</td>
      <td>Gradio Demo</td>
      <td><a href="https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/demo/web_demo/gradio/README.md">Gradio Demo Doc</a></td>
      <td>—</td>
      <td>—</td>
      <td>—</td>
    </tr>
  </tbody>
</table>

> Note: If you'd like us to prioritize support for another open-source framework, please let us know via this [short form](https://docs.google.com/forms/d/e/1FAIpQLSdyTUrOPBgWqPexs3ORrg47ZcZ1r4vFQaA4ve2iA7L9sMfMWw/viewform).
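
As a quick illustration of the serving row above, here is a minimal sketch of querying MiniCPM-V 4.5 through vLLM's OpenAI-compatible server. The server command in the comment, the port, the image URL, and the sampling settings are illustrative assumptions; the linked vLLM Doc is the authoritative reference.

```python
# Hedged sketch: query MiniCPM-V 4.5 served by vLLM's OpenAI-compatible API.
# Assumes the server was started roughly as:
#   vllm serve openbmb/MiniCPM-V-4_5 --trust-remote-code
# Port, image URL, and sampling settings below are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="openbmb/MiniCPM-V-4_5",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/example.jpg"}},
            {"type": "text", "text": "What is in this image?"},
        ],
    }],
    temperature=0.7,
    max_tokens=256,
)
print(response.choices[0].message.content)
```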
## Usage

@@ -358,7 +443,42 @@ question = 'Compare image 1 and image 2, tell me about the differences between i
msgs = [{'role': 'user', 'content': [image1, image2, question]}]

answer = model.chat(
    msgs=msgs,
    tokenizer=tokenizer
)
print(answer)
```
</details>

#### In-context few-shot learning
<details>
<summary> Click to view Python code running MiniCPM-V 4.5 with few-shot input. </summary>

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained('openbmb/MiniCPM-V-4_5', trust_remote_code=True,
    attn_implementation='sdpa', torch_dtype=torch.bfloat16)
model = model.eval().cuda()
tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-V-4_5', trust_remote_code=True)

question = "production date"
image1 = Image.open('example1.jpg').convert('RGB')
answer1 = "2023.08.04"
image2 = Image.open('example2.jpg').convert('RGB')
answer2 = "2007.04.24"
image_test = Image.open('test.jpg').convert('RGB')

msgs = [
    {'role': 'user', 'content': [image1, question]}, {'role': 'assistant', 'content': [answer1]},
    {'role': 'user', 'content': [image2, question]}, {'role': 'assistant', 'content': [answer2]},
    {'role': 'user', 'content': [image_test, question]}
]

answer = model.chat(
    msgs=msgs,
    tokenizer=tokenizer
)
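# Note: the two (image, question, answer) rounds above act as in-context examples;
# `answer` is expected to be the production date read from test.jpg, in the same
# short date format as the demonstrations (e.g. '2023.08.04').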