Update README.md
Browse files
README.md
CHANGED
@@ -45,7 +45,7 @@ Baichuan-Omni-1.5 has an average score of 73.3 on the OpenCompass list (comprehe
|
|
45 |
Baichuan-Omni-1.5 achieved the best performance on GMAI-MMBench and Openmm-Medical. Using only 7B LLM, the average score exceeded Qwen2-VL-72b by 3%, i.e. 80.7% v.s 83.8%.
|
46 |
|
47 |
- 🎙 **Excellent Voice Capabilities.**
|
48 |
-
Baichuan-Omni-1.5 **supports high-quality, controllable voice bilingual real-time conversations in Chinese and English**. It **outperforms GPT-4o-realtime** in speech understanding tasks (such as ASR and STT, etc.), and demonstrates **the highest speech generation performance among open source models** in semantic and acoustic evaluation of voice conversations.
|
49 |
|
50 |
- 🎬 **Powerful Real-world Understanding and Other Features.**
|
51 |
Baichuan-Omni-1.5 further optimizes the many visual understanding capabilities of Baichuan-omni. It can process images of any aspect ratio and up to 1.8 million pixels (such as 1344x1344). It scored 68.8 points on RealWorldQA, **surpassing commercial closed-source models such as GPT-4o-mini** and recently open-sourced omnimodal models. It scored 85.6/83.6 on the English/Chinese evaluation subsets of MMBench, respectively, which is also in the first echelon of models with the same size.
|
|
|
45 |
Baichuan-Omni-1.5 achieved the best performance on GMAI-MMBench and Openmm-Medical. Using only 7B LLM, the average score exceeded Qwen2-VL-72b by 3%, i.e. 80.7% v.s 83.8%.
|
46 |
|
47 |
- 🎙 **Excellent Voice Capabilities.**
|
48 |
+
Baichuan-Omni-1.5 **supports high-quality, controllable voice bilingual real-time conversations in Chinese and English**. It **outperforms GPT-4o-realtime** in speech understanding tasks (such as ASR and STT, etc.), and demonstrates **the highest speech generation performance among open source models** in semantic and acoustic evaluation of voice conversations.
|
49 |
|
50 |
- 🎬 **Powerful Real-world Understanding and Other Features.**
|
51 |
Baichuan-Omni-1.5 further optimizes the many visual understanding capabilities of Baichuan-omni. It can process images of any aspect ratio and up to 1.8 million pixels (such as 1344x1344). It scored 68.8 points on RealWorldQA, **surpassing commercial closed-source models such as GPT-4o-mini** and recently open-sourced omnimodal models. It scored 85.6/83.6 on the English/Chinese evaluation subsets of MMBench, respectively, which is also in the first echelon of models with the same size.
|