EricWesthoff committed on
Commit d61ba60 · 1 Parent(s): 1ad9761

End of training

Files changed (1)
  1. README.md +65 -5
README.md CHANGED
@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
15
 
16
  This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the None dataset.
17
  It achieves the following results on the evaluation set:
18
- - Loss: 1.4066
19
 
20
  ## Model description
21
 
@@ -34,13 +34,13 @@ More information needed
34
  ### Training hyperparameters
35
 
36
  The following hyperparameters were used during training:
37
- - learning_rate: 0.0006
38
- - train_batch_size: 4
39
- - eval_batch_size: 4
40
  - seed: 42
41
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
42
  - lr_scheduler_type: linear
43
- - training_steps: 12000
44
 
45
  ### Training results
46
 
@@ -166,6 +166,66 @@ The following hyperparameters were used during training:
166
  | 1.4165 | 4.72 | 11800 | 1.4070 |
167
  | 1.4103 | 4.76 | 11900 | 1.4067 |
168
  | 1.4214 | 4.8 | 12000 | 1.4066 |
169
 
170
 
171
  ### Framework versions
 
15
 
16
  This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the None dataset.
17
  It achieves the following results on the evaluation set:
18
+ - Loss: 1.2770
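
To put the change in evaluation loss into more familiar terms, the two values can be converted to perplexity. This is a minimal sketch that assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state explicitly; the helper name `perplexity` is illustrative only.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


old_loss = 1.4066  # evaluation loss in the previous README revision
new_loss = 1.2770  # evaluation loss in this commit

old_ppl = perplexity(old_loss)
new_ppl = perplexity(new_loss)
print(f"perplexity: {old_ppl:.3f} -> {new_ppl:.3f}")
```

If the loss were averaged differently (for example per sequence), this conversion would not apply.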
19
 
20
  ## Model description
21
 
 
34
  ### Training hyperparameters
35
 
36
  The following hyperparameters were used during training:
37
+ - learning_rate: 0.0002
38
+ - train_batch_size: 8
39
+ - eval_batch_size: 8
40
  - seed: 42
41
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
42
  - lr_scheduler_type: linear
43
+ - training_steps: 18000
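
The combination of `lr_scheduler_type: linear`, `learning_rate: 0.0002`, and `training_steps: 18000` implies a learning rate that decays linearly over the run. The sketch below reproduces that shape in plain Python. It assumes zero warmup steps (the card lists none) and decay to exactly zero at the final step; `linear_lr` is a hypothetical helper, not a function from any library.

```python
def linear_lr(step: int,
              base_lr: float = 2e-4,
              total_steps: int = 18_000,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the model card does not report a warmup value.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for s in (0, 9_000, 18_000):
    print(s, linear_lr(s))
```

Under these assumptions the rate is halved at step 9000 and reaches zero at step 18000, which is consistent with the evaluation loss flattening out over the last few hundred steps in the table.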
44
 
45
  ### Training results
46
 
 
166
  | 1.4165 | 4.72 | 11800 | 1.4070 |
167
  | 1.4103 | 4.76 | 11900 | 1.4067 |
168
  | 1.4214 | 4.8 | 12000 | 1.4066 |
169
+ | 1.4223 | 9.68 | 12100 | 1.4162 |
170
+ | 1.4471 | 9.76 | 12200 | 1.4210 |
171
+ | 1.4165 | 9.84 | 12300 | 1.4154 |
172
+ | 1.4088 | 9.92 | 12400 | 1.4105 |
173
+ | 1.4057 | 10.0 | 12500 | 1.4100 |
174
+ | 1.3778 | 10.08 | 12600 | 1.4034 |
175
+ | 1.4081 | 10.16 | 12700 | 1.4055 |
176
+ | 1.4127 | 10.24 | 12800 | 1.4001 |
177
+ | 1.4282 | 10.32 | 12900 | 1.3924 |
178
+ | 1.4069 | 10.4 | 13000 | 1.3909 |
179
+ | 1.4097 | 10.48 | 13100 | 1.3885 |
180
+ | 1.4173 | 10.56 | 13200 | 1.3824 |
181
+ | 1.4282 | 10.64 | 13300 | 1.3798 |
182
+ | 1.4266 | 10.72 | 13400 | 1.3778 |
183
+ | 1.4205 | 10.8 | 13500 | 1.3760 |
184
+ | 1.4347 | 10.88 | 13600 | 1.3730 |
185
+ | 1.4088 | 10.96 | 13700 | 1.3659 |
186
+ | 1.3859 | 11.04 | 13800 | 1.3605 |
187
+ | 1.3711 | 11.12 | 13900 | 1.3572 |
188
+ | 1.3896 | 11.2 | 14000 | 1.3550 |
189
+ | 1.343 | 11.28 | 14100 | 1.3510 |
190
+ | 1.3866 | 11.36 | 14200 | 1.3485 |
191
+ | 1.3603 | 11.44 | 14300 | 1.3468 |
192
+ | 1.3881 | 11.52 | 14400 | 1.3448 |
193
+ | 1.3841 | 11.6 | 14500 | 1.3422 |
194
+ | 1.358 | 11.68 | 14600 | 1.3379 |
195
+ | 1.3704 | 11.76 | 14700 | 1.3352 |
196
+ | 1.3656 | 11.84 | 14800 | 1.3350 |
197
+ | 1.367 | 11.92 | 14900 | 1.3299 |
198
+ | 1.3765 | 12.0 | 15000 | 1.3302 |
199
+ | 1.32 | 12.08 | 15100 | 1.3240 |
200
+ | 1.343 | 12.16 | 15200 | 1.3186 |
201
+ | 1.3254 | 12.24 | 15300 | 1.3159 |
202
+ | 1.3433 | 12.32 | 15400 | 1.3134 |
203
+ | 1.3347 | 12.4 | 15500 | 1.3113 |
204
+ | 1.3304 | 12.48 | 15600 | 1.3110 |
205
+ | 1.3235 | 12.56 | 15700 | 1.3106 |
206
+ | 1.3099 | 12.64 | 15800 | 1.3056 |
207
+ | 1.3176 | 12.72 | 15900 | 1.3027 |
208
+ | 1.3613 | 12.8 | 16000 | 1.3057 |
209
+ | 1.3238 | 12.88 | 16100 | 1.3006 |
210
+ | 1.354 | 12.96 | 16200 | 1.3003 |
211
+ | 1.3324 | 13.04 | 16300 | 1.2967 |
212
+ | 1.322 | 13.12 | 16400 | 1.2945 |
213
+ | 1.3029 | 13.2 | 16500 | 1.2898 |
214
+ | 1.317 | 13.28 | 16600 | 1.2892 |
215
+ | 1.2982 | 13.36 | 16700 | 1.2882 |
216
+ | 1.3092 | 13.44 | 16800 | 1.2878 |
217
+ | 1.3161 | 13.52 | 16900 | 1.2866 |
218
+ | 1.2895 | 13.6 | 17000 | 1.2844 |
219
+ | 1.28 | 13.68 | 17100 | 1.2834 |
220
+ | 1.2849 | 13.76 | 17200 | 1.2822 |
221
+ | 1.3136 | 13.84 | 17300 | 1.2828 |
222
+ | 1.2938 | 13.92 | 17400 | 1.2810 |
223
+ | 1.2994 | 14.0 | 17500 | 1.2803 |
224
+ | 1.3158 | 14.08 | 17600 | 1.2788 |
225
+ | 1.2783 | 14.16 | 17700 | 1.2779 |
226
+ | 1.2811 | 14.24 | 17800 | 1.2774 |
227
+ | 1.2824 | 14.32 | 17900 | 1.2771 |
228
+ | 1.2881 | 14.4 | 18000 | 1.2770 |
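
The epoch column jumps from 4.8 at step 12000 to 9.68 at step 12100. One plausible explanation is the batch size change from 4 to 8 in this commit: each step now consumes twice as many examples, so the same step count maps to twice as many epochs. The sketch below checks that arithmetic. The dataset size is not given in the card; the value here is inferred from the old rows and assumes a single device with no gradient accumulation.

```python
def epochs_seen(step: int, batch_size: int, dataset_size: int) -> float:
    """Epochs completed after `step` optimizer steps at a fixed batch size."""
    return step * batch_size / dataset_size


# Inferred, not reported: 12000 steps * batch 4 corresponded to 4.8 epochs.
DATASET_SIZE = round(12_000 * 4 / 4.8)

print(DATASET_SIZE)
print(epochs_seen(12_000, 4, DATASET_SIZE))  # old rows, batch size 4
print(epochs_seen(12_100, 8, DATASET_SIZE))  # new rows, batch size 8
print(epochs_seen(18_000, 8, DATASET_SIZE))  # final row
```

With the inferred size, the computed values line up with the epoch column in both the old rows and the rows added by this commit.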
229
 
230
 
231
  ### Framework versions