fengzi258 committed on
Commit
9c96887
·
verified ·
1 Parent(s): 0fad588

Update README.md

Files changed (1)
  1. README.md +113 -8
README.md CHANGED
@@ -789,17 +789,122 @@ We suggest that readers refer to our [**Github**](https://github.com/baichuan-inc
789
 
790
  <summary>click to view</summary>
791
 
792
- #### Audio Understanding
793
-
794
- </details>
795
 
796
- <details>
797
 
798
- <summary>click to view</summary>
799
 
800
- #### Speech Generation
801
 
802
- </details>
803
 
804
  <details>
805
 
@@ -978,7 +1083,7 @@ We recommend that interested scholars visit our GitHub repo for more details. [**G
978
 
979
 
980
  ### License
981
- Community use of Baichuan-Omni-1.5/Baichuan-Omni-1.5-base requires adherence to [Apache 2.0](https://github.com/baichuan-inc/Baichuan-Omni-1.5/blob/main/LICENSE) and [Community License for Baichuan-Omni-1.5 Models](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/resolve/main/Baichuan%202%E6%A8%A1%E5%9E%8B%E7%A4%BE%E5%8C%BA%E8%AE%B8%E5%8F%AF%E5%8D%8F%E8%AE%AE.pdf). The Baichuan-Omni-1.5/Baichuan-Omni-1.5-base models support commercial use. If you plan to use the Baichuan-Omni-1.5/Baichuan-Omni-1.5-base models or their derivatives for commercial purposes, please ensure that your entity meets the following conditions:
982
 
983
  1. The Daily Active Users (DAU) of your or your affiliate's service or product is less than 1 million.
984
  2. Neither you nor your affiliates are software service providers or cloud service providers.
 
789
 
790
  <summary>click to view</summary>
791
 
792
+ #### Audio Comprehension and Speech Generation
793
+ <div align="center">
794
+ <table style="margin: 0 auto; text-align: center;">
795
+ <thead>
796
+ <tr>
797
+ <th colspan="12">Audio Comprehension Capability</th>
798
+ </tr></thead>
799
+ <tbody>
800
+ <tr>
801
+ <td rowspan="2">Model</td>
802
+ <td rowspan="2">Size</td>
803
+ <td colspan="2">Reasoning QA</td>
804
+ <td colspan="2">Llama Questions</td>
805
+ <td colspan="2">Web Questions</td>
806
+ <td colspan="2">TriviaQA</td>
807
+ <td colspan="2">AlpacaEval</td>
808
+ </tr>
809
+ <tr>
810
+ <td>s→t</td>
811
+ <td>s→s</td>
812
+ <td>s→t</td>
813
+ <td>s→s</td>
814
+ <td>s→t</td>
815
+ <td>s→s</td>
816
+ <td>s→t</td>
817
+ <td>s→s</td>
818
+ <td>s→t</td>
819
+ <td>s→s</td>
820
+ </tr>
821
+ <tr>
822
+ <td colspan="12">Proprietary Models</td>
823
+ </tr>
824
+ <tr>
825
+ <td>GPT-4o-Audio</td>
826
+ <td>-</td>
827
+ <td><b>55.6</b></td>
828
+ <td>-</td>
829
+ <td><b>88.4</b></td>
830
+ <td>-</td>
831
+ <td><b>8.10</b></td>
832
+ <td>-</td>
833
+ <td><b>9.06</b></td>
834
+ <td>-</td>
835
+ <td><b>8.01</b></td>
836
+ <td>-</td>
837
+ </tr>
838
+ <tr>
839
+ <td colspan="12">Open-source Models (Pure Audio)</td>
840
+ </tr>
841
+ <tr>
842
+ <td>GLM-4-Voice</td>
843
+ <td>9B</td>
844
+ <td>-</td>
845
+ <td>26.5</td>
846
+ <td>-</td>
847
+ <td>71.0</td>
848
+ <td>-</td>
849
+ <td>5.15</td>
850
+ <td>-</td>
851
+ <td>4.66</td>
852
+ <td>-</td>
853
+ <td>4.89</td>
854
+ </tr>
855
+ <tr>
856
+ <td colspan="12">Open-source Models (Omni-modal)</td>
857
+ </tr>
858
+ <tr>
859
+ <td>VITA-1.5</td>
860
+ <td>7B</td>
861
+ <td>41.0</td>
862
+ <td>-</td>
863
+ <td>74.2</td>
864
+ <td>-</td>
865
+ <td>5.73</td>
866
+ <td>-</td>
867
+ <td>4.68</td>
868
+ <td>-</td>
869
+ <td>6.82</td>
870
+ <td>-</td>
871
+ </tr>
872
+ <tr>
873
+ <td>MiniCPM-o 2.6</td>
874
+ <td>7B</td>
875
+ <td>38.6</td>
876
+ <td>-</td>
877
+ <td>77.8</td>
878
+ <td>-</td>
879
+ <td>6.86</td>
880
+ <td>-</td>
881
+ <td>6.19</td>
882
+ <td>-</td>
883
+ <td>5.18</td>
884
+ <td>-</td>
885
+ </tr>
886
+ <tr>
887
+ <td><b>Baichuan-Omni-1.5</b></td>
888
+ <td>7B</td>
889
+ <td>50.0</td>
890
+ <td><b>40.9</b></td>
891
+ <td>78.5</td>
892
+ <td><b>75.3</b></td>
893
+ <td>5.91</td>
894
+ <td><b>5.52</b></td>
895
+ <td>5.72</td>
896
+ <td>5.31</td>
897
+ <td>7.79</td>
898
+ <td><b>6.94</b></td>
899
+ </tr>
900
+ </tbody>
901
+ </table>
902
+ </div>
903
 
 
904
 
905
+ </details>
906
 
 
907
 
 
908
 
909
  <details>
910
 
 
1083
 
1084
 
1085
  ### License
1086
+ Community use of Baichuan-Omni-1.5/Baichuan-Omni-1.5-base requires adherence to [Apache 2.0](https://github.com/baichuan-inc/Baichuan-Omni-1.5/blob/main/LICENSE) and [Community License for Baichuan-Omni-1.5 Models](https://github.com/baichuan-inc/Baichuan-Omni-1.5/blob/main/LICENSE). The Baichuan-Omni-1.5/Baichuan-Omni-1.5-base models support commercial use. If you plan to use the Baichuan-Omni-1.5/Baichuan-Omni-1.5-base models or their derivatives for commercial purposes, please ensure that your entity meets the following conditions:
1087
 
1088
  1. The Daily Active Users (DAU) of your or your affiliate's service or product is less than 1 million.
1089
  2. Neither you nor your affiliates are software service providers or cloud service providers.