Update README.md
README.md
JiuZhou outperforms GPT-3.5 in objective tasks:
<p align="center">
<br>
<img src="image/objective_score.png" width="800"/>
<br>
</p>

JiuZhou also scores higher than ClimateChat across six criteria in subjective tasks:
<p align="center">
<br>
<img src="image/subjective_score.png" width="800"/>
<br>
</p>

### General Ability
We evaluate the performance of JiuZhou using three benchmark datasets: C-Eval, CMMLU, and MMLU.<br>
Compared to other variants of Llama and Mistral models, JiuZhou shows outstanding performance:
<p align="center">
<br>
<img src="image/general_score.png" width="800"/>
<br>
</p>

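For readers who want to run a comparable benchmark themselves, the sketch below scores a local checkpoint on the three benchmarks above with EleutherAI's lm-evaluation-harness. The harness choice, task names, few-shot setting, and model path are illustrative assumptions, not the exact evaluation pipeline used for JiuZhou.

```python
# Minimal sketch (assumptions: lm-evaluation-harness installed via `pip install lm-eval`,
# and a checkpoint at a local placeholder path; neither is specified by this repository).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                               # Hugging Face causal-LM backend
    model_args="pretrained=path/to/JiuZhou,dtype=bfloat16",   # placeholder checkpoint path
    tasks=["ceval-valid", "cmmlu", "mmlu"],                   # the three benchmarks named above
    num_fewshot=5,                                            # assumed few-shot setting
    batch_size=8,
)

# Print the aggregate metrics the harness reports for each benchmark.
for task, metrics in results["results"].items():
    print(task, metrics)
```
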
## Model Training Process
### Training Corpus
The corpus consists of 50 million general documents and 3.4 million geoscience-related documents.
<p align="center">
<br>
<img src="image/JiuZhou-Corpus.png" width="800"/>
<br>
</p>

### Training Framework
We use the JiuZhou-Framework proposed in this study.
<p align="center">
<br>
<img src="image/JiuZhou-Framework.png" width="800"/>
<br>
</p>

### Two-stage Pre-adaptation Pre-training (TSPT)
TSPT improves the efficiency of using limited geoscience data and overcomes some of the technical bottlenecks in continual pretraining for LLMs.<br>
The difference between TSPT and single-stage training algorithms:

Comparison of TSPT and one-stage pre-training algorithm performance:
<p align="center">
<br>
<img src="image/TSPT_score.png" width="800"/>
<br>
</p>

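To make the "two-stage" idea concrete, the schematic below runs two successive causal-LM pretraining passes over different corpus mixtures instead of a single pass. Which mixture each stage uses, the file names, and every hyperparameter are placeholders for illustration only; the actual TSPT recipe (data mixtures, ratios, and schedules) follows the JiuZhou paper.

```python
# Schematic two-stage continual pretraining (illustration only; not the exact TSPT recipe).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

BASE = "path/to/base-model"  # placeholder: the checkpoint being adapted
tokenizer = AutoTokenizer.from_pretrained(BASE)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE)

def run_stage(model, data_files, output_dir):
    """One causal-LM pretraining pass over a corpus mixture (assumes JSONL files
    with a "text" field)."""
    ds = load_dataset("json", data_files=data_files, split="train").map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
        batched=True, remove_columns=["text"])
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, per_device_train_batch_size=4,
                               gradient_accumulation_steps=8, num_train_epochs=1,
                               learning_rate=1e-5, bf16=True),
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    return trainer.model

# Stage 1: pre-adapt on the first corpus mixture (placeholder file).
model = run_stage(model, ["corpus_stage1.jsonl"], "out/tspt_stage1")
# Stage 2: continue from the stage-1 weights on the second mixture (placeholder file).
model = run_stage(model, ["corpus_stage2.jsonl"], "out/tspt_stage2")
```
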
## Model Training Code
We use [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) to fine-tune JiuZhou.
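As a rough illustration of what such a fine-tuning step involves, the snippet below is a plain transformers + peft stand-in, not LLaMA-Factory itself: the checkpoint path and LoRA hyperparameters are placeholders, and whether JiuZhou was tuned with LoRA, freeze, or full-parameter updates is not stated here.

```python
# Stand-in sketch only: wrap a base model with LoRA adapters for parameter-efficient
# supervised fine-tuning. LLaMA-Factory automates this through its own configs and CLI.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("path/to/JiuZhou-base")  # placeholder path
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],  # assumed attention projections
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the small LoRA adapters are trainable
# Instruction tuning then proceeds as a standard supervised Trainer run over
# prompt/response pairs, which LLaMA-Factory handles end to end.
```

See the LLaMA-Factory documentation for its actual configuration options, including full-parameter, freeze, and LoRA fine-tuning.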