openbmb
/

MiniCPM-V-4_5

Image-Text-to-Text

feature-extraction

document parsing

Model card Files Files and versions

wangchongyi commited on 5 days ago

Commit

d272cfd

·

1 Parent(s): 73cb2e7

add multi-image infer usage

Files changed (1) hide show

README.md +29 -0

README.md CHANGED Viewed

@@ -260,6 +260,35 @@ answer = model.chat(
 print(answer)
 ```
 ## License
 #### Model License

 print(answer)
 ```
+#### Chat with multiple images
+<details>
+<summary> Click to show Python code running MiniCPM-V 4.5 with multiple images input. </summary>
+```python
+import torch
+from PIL import Image
+from transformers import AutoModel, AutoTokenizer
+model = AutoModel.from_pretrained('openbmb/MiniCPM-V-4_5', trust_remote_code=True,
+    attn_implementation='sdpa', torch_dtype=torch.bfloat16) # sdpa or flash_attention_2
+model = model.eval().cuda()
+tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-V-4_5', trust_remote_code=True)
+image1 = Image.open('image1.jpg').convert('RGB')
+image2 = Image.open('image2.jpg').convert('RGB')
+question = 'Compare image 1 and image 2, tell me about the differences between image 1 and image 2.'
+msgs = [{'role': 'user', 'content': [image1, image2, question]}]
+answer = model.chat(
+    image=None,
+    msgs=msgs,
+    tokenizer=tokenizer
+)
+print(answer)
+```
+</details>
 ## License
 #### Model License