fengzi258 committed · Commit 95b1786 · verified · 1 Parent(s): 26ea7ac

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -44,7 +44,7 @@ Baichuan-Omni-1.5 has an average score of 73.3 on the OpenCompass list (comprehe
  Baichuan-Omni-1.5 achieved the best performance on GMAI-MMBench and Openmm-Medical. Using only 7B LLM, the average score exceeded Qwen2-VL-72b by 3%, i.e. 80.7% v.s 83.8%.
 
  - 🎙 **Excellent Voice Capabilities.**
- Baichuan-Omni-1.5 **supports high-quality, controllable voice bilingual real-time conversations in Chinese and English**. It **outperforms GPT-4o-realtime** in speech understanding tasks (such as ASR and STT, etc.), and demonstrates **the highest speech generation performance among open source models** in semantic and acoustic evaluation of voice conversations. It also supports advanced capabilities such as emotion/speech rate/style control, voice cloning, and role-playing.
+ Baichuan-Omni-1.5 **supports high-quality, controllable voice bilingual real-time conversations in Chinese and English**. It **outperforms GPT-4o-realtime** in speech understanding tasks (such as ASR and STT, etc.), and demonstrates **the highest speech generation performance among open source models** in semantic and acoustic evaluation of voice conversations.
 
  - 🎬 **Powerful Real-world Understanding and Other Features.**
  Baichuan-Omni-1.5 further optimizes the many visual understanding capabilities of Baichuan-omni. It can process images of any aspect ratio and up to 1.8 million pixels (such as 1344x1344). It scored 68.8 points on RealWorldQA, **surpassing commercial closed-source models such as GPT-4o-mini** and recently open-sourced omnimodal models. It scored 85.6/83.6 on the English/Chinese evaluation subsets of MMBench, respectively, which is also in the first echelon of models with the same size.