hts98 committed on
Commit c2f5a2f · 1 Parent(s): ff8c784

End of training

all_results.json ADDED
@@ -0,0 +1,25 @@
+ {
+ "epoch": 120.0,
+ "eval_accuracy": 0.8046732017950711,
+ "eval_f1": 0.6398059560706104,
+ "eval_loss": 1.7624890804290771,
+ "eval_precision": 0.6042249936370577,
+ "eval_recall": 0.6798396334478809,
+ "eval_runtime": 2.8827,
+ "eval_samples": 1112,
+ "eval_samples_per_second": 385.751,
+ "eval_steps_per_second": 6.244,
+ "predict_accuracy": 0.8020527140897511,
+ "predict_f1": 0.6292738631020353,
+ "predict_loss": 1.8004062175750732,
+ "predict_precision": 0.5881466599698644,
+ "predict_recall": 0.676585295392171,
+ "predict_runtime": 5.922,
+ "predict_samples_per_second": 375.715,
+ "predict_steps_per_second": 5.91,
+ "train_loss": 0.10161643302465072,
+ "train_runtime": 6791.1484,
+ "train_samples": 7785,
+ "train_samples_per_second": 137.561,
+ "train_steps_per_second": 2.156
+ }
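The reported F1 values in all_results.json appear consistent with the harmonic mean of the reported precision and recall. The following is a minimal sketch of that check, assuming the metrics are micro-averaged (as entity-level NER metrics commonly are); the function name `f1_from_precision_recall` is hypothetical and not part of this repository.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from all_results.json in this commit.
eval_f1 = f1_from_precision_recall(0.6042249936370577, 0.6798396334478809)
predict_f1 = f1_from_precision_recall(0.5881466599698644, 0.676585295392171)

# Compare against the reported eval_f1 and predict_f1.
print("eval_f1 difference:   ", abs(eval_f1 - 0.6398059560706104))
print("predict_f1 difference:", abs(predict_f1 - 0.6292738631020353))
```

If the differences are near zero, the F1 fields carry no information beyond precision and recall, which can be useful when sanity-checking a results file.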
eval_results.json ADDED
@@ -0,0 +1,12 @@
+ {
+ "epoch": 120.0,
+ "eval_accuracy": 0.8046732017950711,
+ "eval_f1": 0.6398059560706104,
+ "eval_loss": 1.7624890804290771,
+ "eval_precision": 0.6042249936370577,
+ "eval_recall": 0.6798396334478809,
+ "eval_runtime": 2.8827,
+ "eval_samples": 1112,
+ "eval_samples_per_second": 385.751,
+ "eval_steps_per_second": 6.244
+ }
predict_results.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "predict_accuracy": 0.8020527140897511,
+ "predict_f1": 0.6292738631020353,
+ "predict_loss": 1.8004062175750732,
+ "predict_precision": 0.5881466599698644,
+ "predict_recall": 0.676585295392171,
+ "predict_runtime": 5.922,
+ "predict_samples_per_second": 375.715,
+ "predict_steps_per_second": 5.91
+ }
predictions.txt ADDED
The diff for this file is too large to render. See raw diff
 
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+ "epoch": 120.0,
+ "train_loss": 0.10161643302465072,
+ "train_runtime": 6791.1484,
+ "train_samples": 7785,
+ "train_samples_per_second": 137.561,
+ "train_steps_per_second": 2.156
+ }
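The throughput field in train_results.json can be cross-checked from the other fields. This is a rough sketch, assuming throughput is computed as total samples seen (train_samples times epochs) divided by train_runtime in seconds; the helper name `samples_per_second` is hypothetical.

```python
def samples_per_second(num_samples: int, epochs: float, runtime_s: float) -> float:
    """Average training throughput over the whole run."""
    return num_samples * epochs / runtime_s


# Values copied from train_results.json in this commit.
train_samples = 7785
epochs = 120.0
train_runtime = 6791.1484  # seconds

rate = samples_per_second(train_samples, epochs, train_runtime)
hours = train_runtime / 3600

# Compare the first number with the reported train_samples_per_second.
print(f"{rate:.3f} samples/s over {hours:.2f} hours")
```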
trainer_state.json ADDED
@@ -0,0 +1,1639 @@
+ {
+ "best_metric": 0.6398059560706104,
+ "best_model_checkpoint": "/tmp/test-ner1_roberta/checkpoint-13786",
+ "epoch": 120.0,
+ "global_step": 14640,
+ "is_hyper_param_search": false,
+ "is_local_process_zero": true,
+ "is_world_process_zero": true,
+ "log_history": [
+ {
+ "epoch": 1.0,
+ "eval_accuracy": 0.6466768621355329,
+ "eval_f1": 0.2566238574569015,
+ "eval_loss": 1.1791669130325317,
+ "eval_precision": 0.21529800038827412,
+ "eval_recall": 0.31758304696449025,
+ "eval_runtime": 2.9054,
+ "eval_samples_per_second": 382.73,
+ "eval_steps_per_second": 6.195,
+ "step": 122
+ },
+ {
+ "epoch": 2.0,
+ "eval_accuracy": 0.7298117181036428,
+ "eval_f1": 0.40685328185328185,
+ "eval_loss": 0.9055529236793518,
+ "eval_precision": 0.35154295246038364,
+ "eval_recall": 0.48281786941580757,
+ "eval_runtime": 2.852,
+ "eval_samples_per_second": 389.9,
+ "eval_steps_per_second": 6.311,
+ "step": 244
+ },
+ {
+ "epoch": 3.0,
+ "eval_accuracy": 0.7447037882016697,
+ "eval_f1": 0.4581746714162879,
+ "eval_loss": 0.8212449550628662,
+ "eval_precision": 0.401161540116154,
+ "eval_recall": 0.534077892325315,
+ "eval_runtime": 2.8628,
+ "eval_samples_per_second": 388.43,
+ "eval_steps_per_second": 6.288,
+ "step": 366
+ },
+ {
+ "epoch": 4.0,
+ "eval_accuracy": 0.7637576152631184,
+ "eval_f1": 0.48680641183723805,
+ "eval_loss": 0.7602469325065613,
+ "eval_precision": 0.4274577739281074,
+ "eval_recall": 0.5652920962199313,
+ "eval_runtime": 2.8995,
+ "eval_samples_per_second": 383.518,
+ "eval_steps_per_second": 6.208,
+ "step": 488
+ },
+ {
+ "epoch": 4.1,
+ "learning_rate": 2.8975409836065577e-05,
+ "loss": 1.0934,
+ "step": 500
+ },
+ {
+ "epoch": 5.0,
+ "eval_accuracy": 0.7645849524907865,
+ "eval_f1": 0.49326599326599335,
+ "eval_loss": 0.7660626769065857,
+ "eval_precision": 0.42516583747927034,
+ "eval_recall": 0.5873424971363116,
+ "eval_runtime": 2.8913,
+ "eval_samples_per_second": 384.608,
+ "eval_steps_per_second": 6.226,
+ "step": 610
+ },
+ {
+ "epoch": 6.0,
+ "eval_accuracy": 0.7731090330182766,
+ "eval_f1": 0.5357728706624606,
+ "eval_loss": 0.7474268674850464,
+ "eval_precision": 0.47890818858560796,
+ "eval_recall": 0.6079610538373424,
+ "eval_runtime": 2.8539,
+ "eval_samples_per_second": 389.638,
+ "eval_steps_per_second": 6.307,
+ "step": 732
+ },
+ {
+ "epoch": 7.0,
+ "eval_accuracy": 0.7785494020608218,
+ "eval_f1": 0.5398138572905895,
+ "eval_loss": 0.7386994957923889,
+ "eval_precision": 0.49198868991517436,
+ "eval_recall": 0.5979381443298969,
+ "eval_runtime": 2.8557,
+ "eval_samples_per_second": 389.403,
+ "eval_steps_per_second": 6.303,
+ "step": 854
+ },
+ {
+ "epoch": 8.0,
+ "eval_accuracy": 0.7813573344698774,
+ "eval_f1": 0.547186587069732,
+ "eval_loss": 0.7481865882873535,
+ "eval_precision": 0.4916685688199041,
+ "eval_recall": 0.6168384879725086,
+ "eval_runtime": 2.8531,
+ "eval_samples_per_second": 389.752,
+ "eval_steps_per_second": 6.309,
+ "step": 976
+ },
+ {
+ "epoch": 8.2,
+ "learning_rate": 2.795081967213115e-05,
+ "loss": 0.5404,
+ "step": 1000
+ },
+ {
+ "epoch": 9.0,
+ "eval_accuracy": 0.7782234813347707,
+ "eval_f1": 0.5538539425389192,
+ "eval_loss": 0.7774013876914978,
+ "eval_precision": 0.4962576547970061,
+ "eval_recall": 0.6265750286368843,
+ "eval_runtime": 3.061,
+ "eval_samples_per_second": 363.279,
+ "eval_steps_per_second": 5.88,
+ "step": 1098
+ },
+ {
+ "epoch": 10.0,
+ "eval_accuracy": 0.7801790056910773,
+ "eval_f1": 0.5599486521181001,
+ "eval_loss": 0.7820373177528381,
+ "eval_precision": 0.5074453234062355,
+ "eval_recall": 0.6245704467353952,
+ "eval_runtime": 2.9062,
+ "eval_samples_per_second": 382.636,
+ "eval_steps_per_second": 6.194,
+ "step": 1220
+ },
+ {
+ "epoch": 11.0,
+ "eval_accuracy": 0.7816832551959285,
+ "eval_f1": 0.5656822810590632,
+ "eval_loss": 0.7769992351531982,
+ "eval_precision": 0.5091659028414299,
+ "eval_recall": 0.63631156930126,
+ "eval_runtime": 2.8301,
+ "eval_samples_per_second": 392.925,
+ "eval_steps_per_second": 6.36,
+ "step": 1342
+ },
+ {
+ "epoch": 12.0,
+ "eval_accuracy": 0.7849173916313585,
+ "eval_f1": 0.5781455214079328,
+ "eval_loss": 0.8044845461845398,
+ "eval_precision": 0.5339640950994663,
+ "eval_recall": 0.6302978235967927,
+ "eval_runtime": 2.9058,
+ "eval_samples_per_second": 382.688,
+ "eval_steps_per_second": 6.195,
+ "step": 1464
+ },
+ {
+ "epoch": 12.3,
+ "learning_rate": 2.6926229508196725e-05,
+ "loss": 0.3509,
+ "step": 1500
+ },
+ {
+ "epoch": 13.0,
+ "eval_accuracy": 0.7871486950635546,
+ "eval_f1": 0.585631067961165,
+ "eval_loss": 0.8087980151176453,
+ "eval_precision": 0.5343727852586818,
+ "eval_recall": 0.647766323024055,
+ "eval_runtime": 2.8915,
+ "eval_samples_per_second": 384.581,
+ "eval_steps_per_second": 6.225,
+ "step": 1586
+ },
+ {
+ "epoch": 14.0,
+ "eval_accuracy": 0.7768195151302429,
+ "eval_f1": 0.565743073047859,
+ "eval_loss": 0.84703528881073,
+ "eval_precision": 0.5049460431654677,
+ "eval_recall": 0.643184421534937,
+ "eval_runtime": 2.8567,
+ "eval_samples_per_second": 389.267,
+ "eval_steps_per_second": 6.301,
+ "step": 1708
+ },
+ {
+ "epoch": 15.0,
+ "eval_accuracy": 0.7845914709053075,
+ "eval_f1": 0.5803108808290157,
+ "eval_loss": 0.835796058177948,
+ "eval_precision": 0.5298013245033113,
+ "eval_recall": 0.6414662084765178,
+ "eval_runtime": 2.9498,
+ "eval_samples_per_second": 376.98,
+ "eval_steps_per_second": 6.102,
+ "step": 1830
+ },
+ {
+ "epoch": 16.0,
+ "eval_accuracy": 0.7889287236442951,
+ "eval_f1": 0.5734914904589995,
+ "eval_loss": 0.8826290369033813,
+ "eval_precision": 0.5215759849906192,
+ "eval_recall": 0.6368843069873997,
+ "eval_runtime": 2.818,
+ "eval_samples_per_second": 394.606,
+ "eval_steps_per_second": 6.388,
+ "step": 1952
+ },
+ {
+ "epoch": 16.39,
+ "learning_rate": 2.5901639344262294e-05,
+ "loss": 0.2458,
+ "step": 2000
+ },
+ {
+ "epoch": 17.0,
+ "eval_accuracy": 0.7903577606739037,
+ "eval_f1": 0.590254852766407,
+ "eval_loss": 0.8949980735778809,
+ "eval_precision": 0.5476598872825288,
+ "eval_recall": 0.6400343642611683,
+ "eval_runtime": 2.8918,
+ "eval_samples_per_second": 384.538,
+ "eval_steps_per_second": 6.225,
+ "step": 2074
+ },
+ {
+ "epoch": 18.0,
+ "eval_accuracy": 0.7900067691227718,
+ "eval_f1": 0.5787797732772896,
+ "eval_loss": 0.8845651745796204,
+ "eval_precision": 0.5212204634090388,
+ "eval_recall": 0.6506300114547537,
+ "eval_runtime": 2.8519,
+ "eval_samples_per_second": 389.916,
+ "eval_steps_per_second": 6.312,
+ "step": 2196
+ },
+ {
+ "epoch": 19.0,
+ "eval_accuracy": 0.7903577606739037,
+ "eval_f1": 0.5841648308928341,
+ "eval_loss": 0.88877934217453,
+ "eval_precision": 0.533491124260355,
+ "eval_recall": 0.645475372279496,
+ "eval_runtime": 2.9333,
+ "eval_samples_per_second": 379.09,
+ "eval_steps_per_second": 6.136,
+ "step": 2318
+ },
+ {
+ "epoch": 20.0,
+ "eval_accuracy": 0.7915360894527039,
+ "eval_f1": 0.5802566116784498,
+ "eval_loss": 0.893390953540802,
+ "eval_precision": 0.5344910757356488,
+ "eval_recall": 0.6345933562428407,
+ "eval_runtime": 2.8278,
+ "eval_samples_per_second": 393.243,
+ "eval_steps_per_second": 6.365,
+ "step": 2440
+ },
+ {
+ "epoch": 20.49,
+ "learning_rate": 2.487704918032787e-05,
+ "loss": 0.1765,
+ "step": 2500
+ },
+ {
+ "epoch": 21.0,
+ "eval_accuracy": 0.7933411888585253,
+ "eval_f1": 0.5913089142707102,
+ "eval_loss": 0.9481694102287292,
+ "eval_precision": 0.545939393939394,
+ "eval_recall": 0.6449026345933563,
+ "eval_runtime": 2.8849,
+ "eval_samples_per_second": 385.458,
+ "eval_steps_per_second": 6.239,
+ "step": 2562
+ },
+ {
+ "epoch": 22.0,
+ "eval_accuracy": 0.7958984130167723,
+ "eval_f1": 0.5927963326784546,
+ "eval_loss": 0.9498738646507263,
+ "eval_precision": 0.5462225440502052,
+ "eval_recall": 0.6480526918671249,
+ "eval_runtime": 2.8695,
+ "eval_samples_per_second": 387.521,
+ "eval_steps_per_second": 6.273,
+ "step": 2684
+ },
+ {
+ "epoch": 23.0,
+ "eval_accuracy": 0.7868729159876652,
+ "eval_f1": 0.5899986735641332,
+ "eval_loss": 0.9826343059539795,
+ "eval_precision": 0.5495428712626637,
+ "eval_recall": 0.6368843069873997,
+ "eval_runtime": 2.8977,
+ "eval_samples_per_second": 383.754,
+ "eval_steps_per_second": 6.212,
+ "step": 2806
+ },
+ {
+ "epoch": 24.0,
+ "eval_accuracy": 0.7972522375711385,
+ "eval_f1": 0.607061350516848,
+ "eval_loss": 0.9814818501472473,
+ "eval_precision": 0.5713924690422036,
+ "eval_recall": 0.6474799541809851,
+ "eval_runtime": 2.8607,
+ "eval_samples_per_second": 388.722,
+ "eval_steps_per_second": 6.292,
+ "step": 2928
+ },
+ {
+ "epoch": 24.59,
+ "learning_rate": 2.3852459016393442e-05,
+ "loss": 0.1273,
+ "step": 3000
+ },
+ {
+ "epoch": 25.0,
+ "eval_accuracy": 0.7970516709704917,
+ "eval_f1": 0.6025760191209667,
+ "eval_loss": 1.007980227470398,
+ "eval_precision": 0.5617727160188165,
+ "eval_recall": 0.6497709049255441,
+ "eval_runtime": 2.9873,
+ "eval_samples_per_second": 372.236,
+ "eval_steps_per_second": 6.025,
+ "step": 3050
+ },
+ {
+ "epoch": 26.0,
+ "eval_accuracy": 0.7940431719607892,
+ "eval_f1": 0.5959833072509129,
+ "eval_loss": 1.0463485717773438,
+ "eval_precision": 0.5471743295019157,
+ "eval_recall": 0.6543528064146621,
+ "eval_runtime": 2.8597,
+ "eval_samples_per_second": 388.851,
+ "eval_steps_per_second": 6.294,
+ "step": 3172
+ },
+ {
+ "epoch": 27.0,
+ "eval_accuracy": 0.7965753252939555,
+ "eval_f1": 0.5990990990990992,
+ "eval_loss": 1.0348948240280151,
+ "eval_precision": 0.5574457593688363,
+ "eval_recall": 0.6474799541809851,
+ "eval_runtime": 2.8692,
+ "eval_samples_per_second": 387.57,
+ "eval_steps_per_second": 6.274,
+ "step": 3294
+ },
+ {
+ "epoch": 28.0,
+ "eval_accuracy": 0.7898312733472058,
+ "eval_f1": 0.596569333507922,
+ "eval_loss": 1.0559194087982178,
+ "eval_precision": 0.549577804583836,
+ "eval_recall": 0.652348224513173,
+ "eval_runtime": 2.8607,
+ "eval_samples_per_second": 388.718,
+ "eval_steps_per_second": 6.292,
+ "step": 3416
+ },
+ {
+ "epoch": 28.69,
+ "learning_rate": 2.2827868852459018e-05,
+ "loss": 0.0951,
+ "step": 3500
+ },
+ {
+ "epoch": 29.0,
+ "eval_accuracy": 0.7917867977035125,
+ "eval_f1": 0.592843201040989,
+ "eval_loss": 1.0900899171829224,
+ "eval_precision": 0.5432864297638922,
+ "eval_recall": 0.652348224513173,
+ "eval_runtime": 2.9162,
+ "eval_samples_per_second": 381.323,
+ "eval_steps_per_second": 6.173,
+ "step": 3538
+ },
+ {
+ "epoch": 30.0,
+ "eval_accuracy": 0.7840649835786095,
+ "eval_f1": 0.5850218004616569,
+ "eval_loss": 1.1399974822998047,
+ "eval_precision": 0.5297259637714816,
+ "eval_recall": 0.6532073310423826,
+ "eval_runtime": 2.8324,
+ "eval_samples_per_second": 392.593,
+ "eval_steps_per_second": 6.355,
+ "step": 3660
+ },
+ {
+ "epoch": 31.0,
+ "eval_accuracy": 0.7916113019279465,
+ "eval_f1": 0.59593375521745,
+ "eval_loss": 1.1601282358169556,
+ "eval_precision": 0.5623888182973317,
+ "eval_recall": 0.6337342497136311,
+ "eval_runtime": 2.8109,
+ "eval_samples_per_second": 395.6,
+ "eval_steps_per_second": 6.404,
+ "step": 3782
+ },
+ {
+ "epoch": 32.0,
+ "eval_accuracy": 0.7882768821921929,
+ "eval_f1": 0.5970422719539328,
+ "eval_loss": 1.135899305343628,
+ "eval_precision": 0.549771029163654,
+ "eval_recall": 0.6532073310423826,
+ "eval_runtime": 2.8146,
+ "eval_samples_per_second": 395.088,
+ "eval_steps_per_second": 6.395,
+ "step": 3904
+ },
+ {
+ "epoch": 32.79,
+ "learning_rate": 2.180327868852459e-05,
+ "loss": 0.0717,
+ "step": 4000
+ },
+ {
+ "epoch": 33.0,
+ "eval_accuracy": 0.7965001128187129,
+ "eval_f1": 0.6057882912647021,
+ "eval_loss": 1.1268887519836426,
+ "eval_precision": 0.5624539877300614,
+ "eval_recall": 0.6563573883161512,
+ "eval_runtime": 2.8737,
+ "eval_samples_per_second": 386.952,
+ "eval_steps_per_second": 6.264,
+ "step": 4026
+ },
+ {
+ "epoch": 34.0,
+ "eval_accuracy": 0.7966756085942789,
+ "eval_f1": 0.6037027099543869,
+ "eval_loss": 1.1757946014404297,
+ "eval_precision": 0.5678950025239778,
+ "eval_recall": 0.6443298969072165,
+ "eval_runtime": 2.8674,
+ "eval_samples_per_second": 387.814,
+ "eval_steps_per_second": 6.278,
+ "step": 4148
+ },
+ {
+ "epoch": 35.0,
+ "eval_accuracy": 0.7897811316970441,
+ "eval_f1": 0.5957000524383849,
+ "eval_loss": 1.1870158910751343,
+ "eval_precision": 0.5493230174081238,
+ "eval_recall": 0.6506300114547537,
+ "eval_runtime": 2.9346,
+ "eval_samples_per_second": 378.932,
+ "eval_steps_per_second": 6.134,
+ "step": 4270
+ },
+ {
+ "epoch": 36.0,
+ "eval_accuracy": 0.7928397723569083,
+ "eval_f1": 0.5987710811870832,
+ "eval_loss": 1.129560947418213,
+ "eval_precision": 0.5508780370459466,
+ "eval_recall": 0.6557846506300115,
+ "eval_runtime": 2.9245,
+ "eval_samples_per_second": 380.236,
+ "eval_steps_per_second": 6.155,
+ "step": 4392
+ },
+ {
+ "epoch": 36.89,
+ "learning_rate": 2.0778688524590166e-05,
+ "loss": 0.0552,
+ "step": 4500
+ },
+ {
+ "epoch": 37.0,
+ "eval_accuracy": 0.790307619023742,
+ "eval_f1": 0.5933980582524273,
+ "eval_loss": 1.2164160013198853,
+ "eval_precision": 0.5414599574769667,
+ "eval_recall": 0.6563573883161512,
+ "eval_runtime": 2.8667,
+ "eval_samples_per_second": 387.897,
+ "eval_steps_per_second": 6.279,
+ "step": 4514
+ },
+ {
+ "epoch": 38.0,
+ "eval_accuracy": 0.7943440218617595,
+ "eval_f1": 0.6001566988769914,
+ "eval_loss": 1.2046586275100708,
+ "eval_precision": 0.5516082573211714,
+ "eval_recall": 0.6580756013745704,
+ "eval_runtime": 2.8635,
+ "eval_samples_per_second": 388.332,
+ "eval_steps_per_second": 6.286,
+ "step": 4636
+ },
+ {
+ "epoch": 39.0,
+ "eval_accuracy": 0.7949206508386191,
+ "eval_f1": 0.6084432717678101,
+ "eval_loss": 1.2363812923431396,
+ "eval_precision": 0.5640900195694716,
+ "eval_recall": 0.6603665521191294,
+ "eval_runtime": 2.8842,
+ "eval_samples_per_second": 385.555,
+ "eval_steps_per_second": 6.241,
+ "step": 4758
+ },
+ {
+ "epoch": 40.0,
+ "eval_accuracy": 0.7944693759871637,
+ "eval_f1": 0.6042486231313926,
+ "eval_loss": 1.248107671737671,
+ "eval_precision": 0.5573294629898403,
+ "eval_recall": 0.6597938144329897,
+ "eval_runtime": 2.9808,
+ "eval_samples_per_second": 373.05,
+ "eval_steps_per_second": 6.039,
+ "step": 4880
+ },
+ {
+ "epoch": 40.98,
+ "learning_rate": 1.975409836065574e-05,
+ "loss": 0.0432,
+ "step": 5000
+ },
+ {
+ "epoch": 41.0,
+ "eval_accuracy": 0.7926141349311806,
+ "eval_f1": 0.6043454935622318,
+ "eval_loss": 1.276792287826538,
+ "eval_precision": 0.5683652875882946,
+ "eval_recall": 0.6451890034364262,
+ "eval_runtime": 2.8846,
+ "eval_samples_per_second": 385.494,
+ "eval_steps_per_second": 6.24,
+ "step": 5002
+ },
+ {
+ "epoch": 42.0,
+ "eval_accuracy": 0.7957981297164489,
+ "eval_f1": 0.6079725448785638,
+ "eval_loss": 1.2605416774749756,
+ "eval_precision": 0.5639079333986288,
+ "eval_recall": 0.6595074455899198,
+ "eval_runtime": 2.8333,
+ "eval_samples_per_second": 392.474,
+ "eval_steps_per_second": 6.353,
+ "step": 5124
+ },
+ {
+ "epoch": 43.0,
+ "eval_accuracy": 0.7974528041717853,
+ "eval_f1": 0.6125862984599043,
+ "eval_loss": 1.249541163444519,
+ "eval_precision": 0.571039603960396,
+ "eval_recall": 0.6606529209621993,
+ "eval_runtime": 2.8379,
+ "eval_samples_per_second": 391.839,
+ "eval_steps_per_second": 6.343,
+ "step": 5246
+ },
+ {
+ "epoch": 44.0,
+ "eval_accuracy": 0.7954722089903978,
+ "eval_f1": 0.6102653913512056,
+ "eval_loss": 1.2717816829681396,
+ "eval_precision": 0.5761892648181124,
+ "eval_recall": 0.6486254295532646,
+ "eval_runtime": 2.8387,
+ "eval_samples_per_second": 391.732,
+ "eval_steps_per_second": 6.341,
+ "step": 5368
+ },
+ {
+ "epoch": 45.0,
+ "eval_accuracy": 0.8003108782310026,
+ "eval_f1": 0.6093247588424437,
+ "eval_loss": 1.2998257875442505,
+ "eval_precision": 0.5725075528700906,
+ "eval_recall": 0.6512027491408935,
+ "eval_runtime": 2.8997,
+ "eval_samples_per_second": 383.49,
+ "eval_steps_per_second": 6.208,
+ "step": 5490
+ },
+ {
+ "epoch": 45.08,
+ "learning_rate": 1.872950819672131e-05,
+ "loss": 0.0331,
+ "step": 5500
+ },
+ {
+ "epoch": 46.0,
+ "eval_accuracy": 0.7945696592874871,
+ "eval_f1": 0.6008263361322138,
+ "eval_loss": 1.3468672037124634,
+ "eval_precision": 0.5619546247818499,
+ "eval_recall": 0.645475372279496,
+ "eval_runtime": 2.896,
+ "eval_samples_per_second": 383.974,
+ "eval_steps_per_second": 6.215,
+ "step": 5612
+ },
+ {
+ "epoch": 47.0,
+ "eval_accuracy": 0.8010128613332664,
+ "eval_f1": 0.6131348045732518,
+ "eval_loss": 1.3357452154159546,
+ "eval_precision": 0.5722084367245658,
+ "eval_recall": 0.6603665521191294,
+ "eval_runtime": 2.8883,
+ "eval_samples_per_second": 384.996,
+ "eval_steps_per_second": 6.232,
+ "step": 5734
+ },
+ {
+ "epoch": 48.0,
+ "eval_accuracy": 0.7936420387594956,
+ "eval_f1": 0.6039968445963714,
+ "eval_loss": 1.3576422929763794,
+ "eval_precision": 0.5583373845405931,
+ "eval_recall": 0.6577892325315006,
+ "eval_runtime": 2.8265,
+ "eval_samples_per_second": 393.421,
+ "eval_steps_per_second": 6.368,
+ "step": 5856
+ },
+ {
+ "epoch": 49.0,
+ "eval_accuracy": 0.7985057788251811,
+ "eval_f1": 0.6147880732718278,
+ "eval_loss": 1.3396836519241333,
+ "eval_precision": 0.5766240280912968,
+ "eval_recall": 0.6583619702176403,
+ "eval_runtime": 2.908,
+ "eval_samples_per_second": 382.397,
+ "eval_steps_per_second": 6.19,
+ "step": 5978
+ },
+ {
+ "epoch": 49.18,
+ "learning_rate": 1.7704918032786887e-05,
+ "loss": 0.0265,
+ "step": 6000
+ },
+ {
+ "epoch": 50.0,
+ "eval_accuracy": 0.7961240504425,
+ "eval_f1": 0.6078405315614618,
+ "eval_loss": 1.3641352653503418,
+ "eval_precision": 0.5670716588147781,
+ "eval_recall": 0.6549255441008018,
+ "eval_runtime": 2.9731,
+ "eval_samples_per_second": 374.026,
+ "eval_steps_per_second": 6.054,
+ "step": 6100
+ },
+ {
+ "epoch": 51.0,
+ "eval_accuracy": 0.7937924637099807,
+ "eval_f1": 0.6058036305816881,
+ "eval_loss": 1.3726739883422852,
+ "eval_precision": 0.5637484586929716,
+ "eval_recall": 0.654639175257732,
+ "eval_runtime": 2.8737,
+ "eval_samples_per_second": 386.957,
+ "eval_steps_per_second": 6.264,
+ "step": 6222
+ },
+ {
+ "epoch": 52.0,
+ "eval_accuracy": 0.7926642765813423,
+ "eval_f1": 0.6081809811916349,
+ "eval_loss": 1.4024547338485718,
+ "eval_precision": 0.5623935782048164,
+ "eval_recall": 0.6620847651775487,
+ "eval_runtime": 2.8503,
+ "eval_samples_per_second": 390.136,
+ "eval_steps_per_second": 6.315,
+ "step": 6344
+ },
+ {
+ "epoch": 53.0,
+ "eval_accuracy": 0.7915862311028656,
+ "eval_f1": 0.6084185367149118,
+ "eval_loss": 1.399110198020935,
+ "eval_precision": 0.5672196088140629,
+ "eval_recall": 0.6560710194730813,
+ "eval_runtime": 2.9033,
+ "eval_samples_per_second": 383.006,
+ "eval_steps_per_second": 6.2,
+ "step": 6466
+ },
+ {
+ "epoch": 53.28,
+ "learning_rate": 1.668032786885246e-05,
+ "loss": 0.0212,
+ "step": 6500
+ },
+ {
+ "epoch": 54.0,
+ "eval_accuracy": 0.7953217840399127,
+ "eval_f1": 0.6119815668202765,
+ "eval_loss": 1.4267858266830444,
+ "eval_precision": 0.5664148184255423,
+ "eval_recall": 0.6655211912943871,
+ "eval_runtime": 2.9191,
+ "eval_samples_per_second": 380.935,
+ "eval_steps_per_second": 6.166,
+ "step": 6588
+ },
+ {
+ "epoch": 55.0,
+ "eval_accuracy": 0.7943690926868403,
+ "eval_f1": 0.6073173953242637,
+ "eval_loss": 1.4376713037490845,
+ "eval_precision": 0.5636185339544005,
+ "eval_recall": 0.6583619702176403,
+ "eval_runtime": 3.0593,
+ "eval_samples_per_second": 363.477,
+ "eval_steps_per_second": 5.884,
+ "step": 6710
+ },
+ {
+ "epoch": 56.0,
+ "eval_accuracy": 0.795271642389751,
+ "eval_f1": 0.6113687557970054,
+ "eval_loss": 1.4307470321655273,
+ "eval_precision": 0.5689272503082614,
+ "eval_recall": 0.6606529209621993,
+ "eval_runtime": 2.9231,
+ "eval_samples_per_second": 380.418,
+ "eval_steps_per_second": 6.158,
+ "step": 6832
+ },
+ {
+ "epoch": 57.0,
+ "eval_accuracy": 0.7917115852282699,
+ "eval_f1": 0.6049250535331906,
+ "eval_loss": 1.4772576093673706,
+ "eval_precision": 0.5678391959798995,
+ "eval_recall": 0.6471935853379153,
+ "eval_runtime": 2.9801,
+ "eval_samples_per_second": 373.136,
+ "eval_steps_per_second": 6.04,
+ "step": 6954
+ },
+ {
+ "epoch": 57.38,
+ "learning_rate": 1.5655737704918035e-05,
+ "loss": 0.0171,
+ "step": 7000
+ },
+ {
+ "epoch": 58.0,
+ "eval_accuracy": 0.7976032291222704,
+ "eval_f1": 0.6215258855585831,
+ "eval_loss": 1.46255362033844,
+ "eval_precision": 0.5927754677754677,
+ "eval_recall": 0.6532073310423826,
+ "eval_runtime": 2.8727,
+ "eval_samples_per_second": 387.093,
+ "eval_steps_per_second": 6.266,
+ "step": 7076
+ },
+ {
+ "epoch": 59.0,
+ "eval_accuracy": 0.7991074786271216,
+ "eval_f1": 0.6118965057348627,
+ "eval_loss": 1.4488873481750488,
+ "eval_precision": 0.5726410384423365,
+ "eval_recall": 0.656930126002291,
+ "eval_runtime": 2.9083,
+ "eval_samples_per_second": 382.356,
+ "eval_steps_per_second": 6.189,
+ "step": 7198
+ },
+ {
+ "epoch": 60.0,
+ "eval_accuracy": 0.8010379321583473,
+ "eval_f1": 0.6189835977413283,
+ "eval_loss": 1.447924017906189,
+ "eval_precision": 0.5833755701976685,
+ "eval_recall": 0.6592210767468499,
+ "eval_runtime": 2.8818,
+ "eval_samples_per_second": 385.865,
+ "eval_steps_per_second": 6.246,
+ "step": 7320
+ },
+ {
+ "epoch": 61.0,
+ "eval_accuracy": 0.7975530874721087,
+ "eval_f1": 0.6155924875016889,
+ "eval_loss": 1.464934229850769,
+ "eval_precision": 0.5827577385520594,
+ "eval_recall": 0.652348224513173,
+ "eval_runtime": 3.0162,
+ "eval_samples_per_second": 368.671,
+ "eval_steps_per_second": 5.968,
+ "step": 7442
+ },
+ {
+ "epoch": 61.48,
+ "learning_rate": 1.4631147540983607e-05,
+ "loss": 0.0142,
+ "step": 7500
+ },
+ {
+ "epoch": 62.0,
+ "eval_accuracy": 0.8005866573068919,
+ "eval_f1": 0.617394747261449,
+ "eval_loss": 1.5170141458511353,
+ "eval_precision": 0.5725826193390453,
+ "eval_recall": 0.6698167239404352,
+ "eval_runtime": 2.8537,
+ "eval_samples_per_second": 389.666,
+ "eval_steps_per_second": 6.308,
+ "step": 7564
+ },
+ {
+ "epoch": 63.0,
+ "eval_accuracy": 0.798530849650262,
+ "eval_f1": 0.6186992951190318,
+ "eval_loss": 1.486588478088379,
+ "eval_precision": 0.5776011919543084,
+ "eval_recall": 0.6660939289805269,
+ "eval_runtime": 2.9317,
+ "eval_samples_per_second": 379.306,
+ "eval_steps_per_second": 6.14,
+ "step": 7686
+ },
+ {
+ "epoch": 64.0,
+ "eval_accuracy": 0.8009877905081856,
+ "eval_f1": 0.6169074371321562,
+ "eval_loss": 1.544573426246643,
+ "eval_precision": 0.5788152610441767,
+ "eval_recall": 0.6603665521191294,
+ "eval_runtime": 2.9763,
+ "eval_samples_per_second": 373.624,
+ "eval_steps_per_second": 6.048,
+ "step": 7808
+ },
+ {
+ "epoch": 65.0,
+ "eval_accuracy": 0.7935166846340913,
+ "eval_f1": 0.6111037498343711,
+ "eval_loss": 1.5565959215164185,
+ "eval_precision": 0.5686806411837237,
+ "eval_recall": 0.6603665521191294,
+ "eval_runtime": 2.8561,
+ "eval_samples_per_second": 389.345,
+ "eval_steps_per_second": 6.302,
+ "step": 7930
+ },
+ {
+ "epoch": 65.57,
+ "learning_rate": 1.3606557377049181e-05,
+ "loss": 0.0114,
+ "step": 8000
+ },
+ {
+ "epoch": 66.0,
+ "eval_accuracy": 0.795948554666934,
+ "eval_f1": 0.6242587601078167,
+ "eval_loss": 1.5454256534576416,
+ "eval_precision": 0.5896130346232179,
+ "eval_recall": 0.6632302405498282,
+ "eval_runtime": 2.9632,
+ "eval_samples_per_second": 375.274,
+ "eval_steps_per_second": 6.075,
+ "step": 8052
+ },
+ {
+ "epoch": 67.0,
+ "eval_accuracy": 0.7998345325544663,
+ "eval_f1": 0.6325366648560564,
+ "eval_loss": 1.5341241359710693,
+ "eval_precision": 0.6014979338842975,
+ "eval_recall": 0.6669530355097365,
+ "eval_runtime": 2.8476,
+ "eval_samples_per_second": 390.501,
+ "eval_steps_per_second": 6.321,
+ "step": 8174
+ },
+ {
+ "epoch": 68.0,
+ "eval_accuracy": 0.7963246170431469,
+ "eval_f1": 0.6196650459211237,
+ "eval_loss": 1.5298110246658325,
+ "eval_precision": 0.58640081799591,
+ "eval_recall": 0.656930126002291,
+ "eval_runtime": 2.9457,
+ "eval_samples_per_second": 377.495,
+ "eval_steps_per_second": 6.111,
+ "step": 8296
+ },
+ {
+ "epoch": 69.0,
+ "eval_accuracy": 0.7943941635119212,
+ "eval_f1": 0.617556946849607,
+ "eval_loss": 1.5693646669387817,
+ "eval_precision": 0.57733499377335,
+ "eval_recall": 0.6638029782359679,
+ "eval_runtime": 2.8694,
+ "eval_samples_per_second": 387.533,
+ "eval_steps_per_second": 6.273,
+ "step": 8418
+ },
+ {
+ "epoch": 69.67,
+ "learning_rate": 1.2581967213114756e-05,
+ "loss": 0.0101,
+ "step": 8500
+ },
+ {
+ "epoch": 70.0,
+ "eval_accuracy": 0.7977285832476747,
+ "eval_f1": 0.6197596795727636,
+ "eval_loss": 1.591435432434082,
+ "eval_precision": 0.5805402701350675,
+ "eval_recall": 0.6646620847651775,
+ "eval_runtime": 2.9323,
+ "eval_samples_per_second": 379.23,
+ "eval_steps_per_second": 6.139,
+ "step": 8540
+ },
+ {
+ "epoch": 71.0,
+ "eval_accuracy": 0.7956978464161255,
+ "eval_f1": 0.6129676474504061,
+ "eval_loss": 1.568572998046875,
+ "eval_precision": 0.5727792983329186,
+ "eval_recall": 0.6592210767468499,
+ "eval_runtime": 2.9544,
960
+ "eval_samples_per_second": 376.394,
961
+ "eval_steps_per_second": 6.093,
962
+ "step": 8662
963
+ },
964
+ {
965
+ "epoch": 72.0,
966
+ "eval_accuracy": 0.7949206508386191,
967
+ "eval_f1": 0.6126834381551364,
968
+ "eval_loss": 1.6199277639389038,
969
+ "eval_precision": 0.5647342995169082,
970
+ "eval_recall": 0.6695303550973654,
971
+ "eval_runtime": 2.9153,
972
+ "eval_samples_per_second": 381.436,
973
+ "eval_steps_per_second": 6.174,
974
+ "step": 8784
975
+ },
976
+ {
977
+ "epoch": 73.0,
978
+ "eval_accuracy": 0.7943941635119212,
979
+ "eval_f1": 0.623042954636692,
980
+ "eval_loss": 1.634416103363037,
981
+ "eval_precision": 0.584777694046722,
982
+ "eval_recall": 0.6666666666666666,
983
+ "eval_runtime": 2.8778,
984
+ "eval_samples_per_second": 386.41,
985
+ "eval_steps_per_second": 6.255,
986
+ "step": 8906
987
+ },
988
+ {
989
+ "epoch": 73.77,
990
+ "learning_rate": 1.1557377049180328e-05,
991
+ "loss": 0.0079,
992
+ "step": 9000
993
+ },
994
+ {
995
+ "epoch": 74.0,
996
+ "eval_accuracy": 0.7969513876701683,
997
+ "eval_f1": 0.6292225201072386,
998
+ "eval_loss": 1.557986855506897,
999
+ "eval_precision": 0.5914818548387096,
1000
+ "eval_recall": 0.6721076746849943,
1001
+ "eval_runtime": 2.8737,
1002
+ "eval_samples_per_second": 386.956,
1003
+ "eval_steps_per_second": 6.264,
1004
+ "step": 9028
1005
+ },
1006
+ {
1007
+ "epoch": 75.0,
1008
+ "eval_accuracy": 0.8006117281319728,
1009
+ "eval_f1": 0.6340199154276359,
1010
+ "eval_loss": 1.6272269487380981,
1011
+ "eval_precision": 0.6053659807241469,
1012
+ "eval_recall": 0.6655211912943871,
1013
+ "eval_runtime": 2.8776,
1014
+ "eval_samples_per_second": 386.439,
1015
+ "eval_steps_per_second": 6.255,
1016
+ "step": 9150
1017
+ },
1018
+ {
1019
+ "epoch": 76.0,
1020
+ "eval_accuracy": 0.7967758918946023,
1021
+ "eval_f1": 0.6184052357419527,
1022
+ "eval_loss": 1.6266722679138184,
1023
+ "eval_precision": 0.5794743429286608,
1024
+ "eval_recall": 0.6629438717067583,
1025
+ "eval_runtime": 2.9302,
1026
+ "eval_samples_per_second": 379.493,
1027
+ "eval_steps_per_second": 6.143,
1028
+ "step": 9272
1029
+ },
1030
+ {
1031
+ "epoch": 77.0,
1032
+ "eval_accuracy": 0.7958984130167723,
1033
+ "eval_f1": 0.6194760518655729,
1034
+ "eval_loss": 1.6500694751739502,
1035
+ "eval_precision": 0.5757501229709788,
1036
+ "eval_recall": 0.670389461626575,
1037
+ "eval_runtime": 2.8781,
1038
+ "eval_samples_per_second": 386.368,
1039
+ "eval_steps_per_second": 6.254,
1040
+ "step": 9394
1041
+ },
1042
+ {
1043
+ "epoch": 77.87,
1044
+ "learning_rate": 1.0532786885245902e-05,
1045
+ "loss": 0.0065,
1046
+ "step": 9500
1047
+ },
1048
+ {
1049
+ "epoch": 78.0,
1050
+ "eval_accuracy": 0.7995336826534961,
1051
+ "eval_f1": 0.630329195898543,
1052
+ "eval_loss": 1.6222110986709595,
1053
+ "eval_precision": 0.5959183673469388,
1054
+ "eval_recall": 0.6689576174112256,
1055
+ "eval_runtime": 2.9051,
1056
+ "eval_samples_per_second": 382.773,
1057
+ "eval_steps_per_second": 6.196,
1058
+ "step": 9516
1059
+ },
1060
+ {
1061
+ "epoch": 79.0,
1062
+ "eval_accuracy": 0.7965502544688746,
1063
+ "eval_f1": 0.6257701580498258,
1064
+ "eval_loss": 1.6543381214141846,
1065
+ "eval_precision": 0.587820835430297,
1066
+ "eval_recall": 0.6689576174112256,
1067
+ "eval_runtime": 2.9225,
1068
+ "eval_samples_per_second": 380.492,
1069
+ "eval_steps_per_second": 6.159,
1070
+ "step": 9638
1071
+ },
1072
+ {
1073
+ "epoch": 80.0,
1074
+ "eval_accuracy": 0.8008875072078622,
1075
+ "eval_f1": 0.6276252019386107,
1076
+ "eval_loss": 1.605409026145935,
1077
+ "eval_precision": 0.5922256097560976,
1078
+ "eval_recall": 0.6675257731958762,
1079
+ "eval_runtime": 2.8935,
1080
+ "eval_samples_per_second": 384.305,
1081
+ "eval_steps_per_second": 6.221,
1082
+ "step": 9760
1083
+ },
1084
+ {
1085
+ "epoch": 81.0,
1086
+ "eval_accuracy": 0.8007872239075388,
1087
+ "eval_f1": 0.6293103448275862,
1088
+ "eval_loss": 1.6386600732803345,
1089
+ "eval_precision": 0.5940996948118006,
1090
+ "eval_recall": 0.6689576174112256,
1091
+ "eval_runtime": 2.9359,
1092
+ "eval_samples_per_second": 378.763,
1093
+ "eval_steps_per_second": 6.131,
1094
+ "step": 9882
1095
+ },
1096
+ {
1097
+ "epoch": 81.97,
1098
+ "learning_rate": 9.508196721311476e-06,
1099
+ "loss": 0.0053,
1100
+ "step": 10000
1101
+ },
1102
+ {
1103
+ "epoch": 82.0,
1104
+ "eval_accuracy": 0.8047233434452328,
1105
+ "eval_f1": 0.6390403489640131,
1106
+ "eval_loss": 1.6452571153640747,
1107
+ "eval_precision": 0.6097814776274714,
1108
+ "eval_recall": 0.6712485681557846,
1109
+ "eval_runtime": 2.9102,
1110
+ "eval_samples_per_second": 382.104,
1111
+ "eval_steps_per_second": 6.185,
1112
+ "step": 10004
1113
+ },
1114
+ {
1115
+ "epoch": 83.0,
1116
+ "eval_accuracy": 0.8004613031814877,
1117
+ "eval_f1": 0.6212403513441574,
1118
+ "eval_loss": 1.679402232170105,
1119
+ "eval_precision": 0.5803083043262058,
1120
+ "eval_recall": 0.6683848797250859,
1121
+ "eval_runtime": 2.9425,
1122
+ "eval_samples_per_second": 377.908,
1123
+ "eval_steps_per_second": 6.117,
1124
+ "step": 10126
1125
+ },
1126
+ {
1127
+ "epoch": 84.0,
1128
+ "eval_accuracy": 0.7990322661518791,
1129
+ "eval_f1": 0.6314366806325179,
1130
+ "eval_loss": 1.700645923614502,
1131
+ "eval_precision": 0.5979012029690299,
1132
+ "eval_recall": 0.6689576174112256,
1133
+ "eval_runtime": 2.8808,
1134
+ "eval_samples_per_second": 386.008,
1135
+ "eval_steps_per_second": 6.248,
1136
+ "step": 10248
1137
+ },
1138
+ {
1139
+ "epoch": 85.0,
1140
+ "eval_accuracy": 0.7989069120264748,
1141
+ "eval_f1": 0.6296992481203008,
1142
+ "eval_loss": 1.682008147239685,
1143
+ "eval_precision": 0.5927704752275025,
1144
+ "eval_recall": 0.6715349369988545,
1145
+ "eval_runtime": 2.8543,
1146
+ "eval_samples_per_second": 389.584,
1147
+ "eval_steps_per_second": 6.306,
1148
+ "step": 10370
1149
+ },
1150
+ {
1151
+ "epoch": 86.0,
1152
+ "eval_accuracy": 0.7983052122245343,
1153
+ "eval_f1": 0.6285100094048098,
1154
+ "eval_loss": 1.6995329856872559,
1155
+ "eval_precision": 0.5920020248038471,
1156
+ "eval_recall": 0.6698167239404352,
1157
+ "eval_runtime": 3.0037,
1158
+ "eval_samples_per_second": 370.211,
1159
+ "eval_steps_per_second": 5.993,
1160
+ "step": 10492
1161
+ },
1162
+ {
1163
+ "epoch": 86.07,
1164
+ "learning_rate": 8.483606557377049e-06,
1165
+ "loss": 0.0045,
1166
+ "step": 10500
1167
+ },
1168
+ {
1169
+ "epoch": 87.0,
1170
+ "eval_accuracy": 0.8004863740065685,
1171
+ "eval_f1": 0.6253886710828714,
1172
+ "eval_loss": 1.6652146577835083,
1173
+ "eval_precision": 0.5923175416133163,
1174
+ "eval_recall": 0.6623711340206185,
1175
+ "eval_runtime": 2.9086,
1176
+ "eval_samples_per_second": 382.312,
1177
+ "eval_steps_per_second": 6.189,
1178
+ "step": 10614
1179
+ },
1180
+ {
1181
+ "epoch": 88.0,
1182
+ "eval_accuracy": 0.7990824078020408,
1183
+ "eval_f1": 0.6266846361185983,
1184
+ "eval_loss": 1.7196266651153564,
1185
+ "eval_precision": 0.5919042769857433,
1186
+ "eval_recall": 0.665807560137457,
1187
+ "eval_runtime": 2.8941,
1188
+ "eval_samples_per_second": 384.232,
1189
+ "eval_steps_per_second": 6.22,
1190
+ "step": 10736
1191
+ },
1192
+ {
1193
+ "epoch": 89.0,
1194
+ "eval_accuracy": 0.805375184897335,
1195
+ "eval_f1": 0.6272862755724156,
1196
+ "eval_loss": 1.6730009317398071,
1197
+ "eval_precision": 0.5952687066083826,
1198
+ "eval_recall": 0.6629438717067583,
1199
+ "eval_runtime": 2.9839,
1200
+ "eval_samples_per_second": 372.672,
1201
+ "eval_steps_per_second": 6.032,
1202
+ "step": 10858
1203
+ },
1204
+ {
1205
+ "epoch": 90.0,
1206
+ "eval_accuracy": 0.8023165442374709,
1207
+ "eval_f1": 0.6332482193253596,
1208
+ "eval_loss": 1.709200143814087,
1209
+ "eval_precision": 0.5966067358825019,
1210
+ "eval_recall": 0.6746849942726232,
1211
+ "eval_runtime": 2.9784,
1212
+ "eval_samples_per_second": 373.352,
1213
+ "eval_steps_per_second": 6.043,
1214
+ "step": 10980
1215
+ },
1216
+ {
1217
+ "epoch": 90.16,
1218
+ "learning_rate": 7.459016393442623e-06,
1219
+ "loss": 0.0037,
1220
+ "step": 11000
1221
+ },
1222
+ {
1223
+ "epoch": 91.0,
1224
+ "eval_accuracy": 0.8009627196831047,
1225
+ "eval_f1": 0.6340402392604676,
1226
+ "eval_loss": 1.7260353565216064,
1227
+ "eval_precision": 0.6035196687370601,
1228
+ "eval_recall": 0.6678121420389461,
1229
+ "eval_runtime": 2.8928,
1230
+ "eval_samples_per_second": 384.405,
1231
+ "eval_steps_per_second": 6.222,
1232
+ "step": 11102
1233
+ },
1234
+ {
1235
+ "epoch": 92.0,
1236
+ "eval_accuracy": 0.8030185273397348,
1237
+ "eval_f1": 0.631593220338983,
1238
+ "eval_loss": 1.71060311794281,
1239
+ "eval_precision": 0.5997939737316508,
1240
+ "eval_recall": 0.6669530355097365,
1241
+ "eval_runtime": 2.9325,
1242
+ "eval_samples_per_second": 379.201,
1243
+ "eval_steps_per_second": 6.138,
1244
+ "step": 11224
1245
+ },
1246
+ {
1247
+ "epoch": 93.0,
1248
+ "eval_accuracy": 0.8027928899140071,
1249
+ "eval_f1": 0.6377910124526259,
1250
+ "eval_loss": 1.709562063217163,
1251
+ "eval_precision": 0.6047227926078029,
1252
+ "eval_recall": 0.6746849942726232,
1253
+ "eval_runtime": 2.929,
1254
+ "eval_samples_per_second": 379.651,
1255
+ "eval_steps_per_second": 6.145,
1256
+ "step": 11346
1257
+ },
1258
+ {
1259
+ "epoch": 94.0,
1260
+ "eval_accuracy": 0.8009627196831047,
1261
+ "eval_f1": 0.6353984679478566,
1262
+ "eval_loss": 1.721982717514038,
1263
+ "eval_precision": 0.5986325652063814,
1264
+ "eval_recall": 0.6769759450171822,
1265
+ "eval_runtime": 2.8786,
1266
+ "eval_samples_per_second": 386.296,
1267
+ "eval_steps_per_second": 6.253,
1268
+ "step": 11468
1269
+ },
1270
+ {
1271
+ "epoch": 94.26,
1272
+ "learning_rate": 6.434426229508197e-06,
1273
+ "loss": 0.0032,
1274
+ "step": 11500
1275
+ },
1276
+ {
1277
+ "epoch": 95.0,
1278
+ "eval_accuracy": 0.7994333993531727,
1279
+ "eval_f1": 0.6351459951781409,
1280
+ "eval_loss": 1.7394192218780518,
1281
+ "eval_precision": 0.5966280825364871,
1282
+ "eval_recall": 0.6789805269186713,
1283
+ "eval_runtime": 2.9637,
1284
+ "eval_samples_per_second": 375.209,
1285
+ "eval_steps_per_second": 6.074,
1286
+ "step": 11590
1287
+ },
1288
+ {
1289
+ "epoch": 96.0,
1290
+ "eval_accuracy": 0.8004613031814877,
1291
+ "eval_f1": 0.6391640656805537,
1292
+ "eval_loss": 1.7256627082824707,
1293
+ "eval_precision": 0.6074284240392056,
1294
+ "eval_recall": 0.6743986254295533,
1295
+ "eval_runtime": 2.9246,
1296
+ "eval_samples_per_second": 380.221,
1297
+ "eval_steps_per_second": 6.155,
1298
+ "step": 11712
1299
+ },
1300
+ {
1301
+ "epoch": 97.0,
1302
+ "eval_accuracy": 0.8039461478677263,
1303
+ "eval_f1": 0.635028555887952,
1304
+ "eval_loss": 1.700808048248291,
1305
+ "eval_precision": 0.6046090108751943,
1306
+ "eval_recall": 0.6686712485681557,
1307
+ "eval_runtime": 2.8758,
1308
+ "eval_samples_per_second": 386.673,
1309
+ "eval_steps_per_second": 6.259,
1310
+ "step": 11834
1311
+ },
1312
+ {
1313
+ "epoch": 98.0,
1314
+ "eval_accuracy": 0.8032190939403816,
1315
+ "eval_f1": 0.6355140186915889,
1316
+ "eval_loss": 1.7482486963272095,
1317
+ "eval_precision": 0.6029298380878951,
1318
+ "eval_recall": 0.6718213058419243,
1319
+ "eval_runtime": 2.927,
1320
+ "eval_samples_per_second": 379.912,
1321
+ "eval_steps_per_second": 6.15,
1322
+ "step": 11956
1323
+ },
1324
+ {
1325
+ "epoch": 98.36,
1326
+ "learning_rate": 5.409836065573771e-06,
1327
+ "loss": 0.0028,
1328
+ "step": 12000
1329
+ },
1330
+ {
1331
+ "epoch": 99.0,
1332
+ "eval_accuracy": 0.8030435981648156,
1333
+ "eval_f1": 0.6319491410793994,
1334
+ "eval_loss": 1.7569934129714966,
1335
+ "eval_precision": 0.598820815175596,
1336
+ "eval_recall": 0.6689576174112256,
1337
+ "eval_runtime": 2.8862,
1338
+ "eval_samples_per_second": 385.277,
1339
+ "eval_steps_per_second": 6.236,
1340
+ "step": 12078
1341
+ },
1342
+ {
1343
+ "epoch": 100.0,
1344
+ "eval_accuracy": 0.8026173941384411,
1345
+ "eval_f1": 0.6335353535353535,
1346
+ "eval_loss": 1.733221411705017,
1347
+ "eval_precision": 0.5980167810831426,
1348
+ "eval_recall": 0.6735395189003437,
1349
+ "eval_runtime": 2.9532,
1350
+ "eval_samples_per_second": 376.543,
1351
+ "eval_steps_per_second": 6.095,
1352
+ "step": 12200
1353
+ },
1354
+ {
1355
+ "epoch": 101.0,
1356
+ "eval_accuracy": 0.8011131446335898,
1357
+ "eval_f1": 0.6279817743232377,
1358
+ "eval_loss": 1.749053955078125,
1359
+ "eval_precision": 0.590176322418136,
1360
+ "eval_recall": 0.6709621993127147,
1361
+ "eval_runtime": 2.8456,
1362
+ "eval_samples_per_second": 390.774,
1363
+ "eval_steps_per_second": 6.325,
1364
+ "step": 12322
1365
+ },
1366
+ {
1367
+ "epoch": 102.0,
1368
+ "eval_accuracy": 0.8033444480657859,
1369
+ "eval_f1": 0.6348178137651822,
1370
+ "eval_loss": 1.754211664199829,
1371
+ "eval_precision": 0.6003062787136294,
1372
+ "eval_recall": 0.6735395189003437,
1373
+ "eval_runtime": 3.0603,
1374
+ "eval_samples_per_second": 363.358,
1375
+ "eval_steps_per_second": 5.882,
1376
+ "step": 12444
1377
+ },
1378
+ {
1379
+ "epoch": 102.46,
1380
+ "learning_rate": 4.385245901639344e-06,
1381
+ "loss": 0.0021,
1382
+ "step": 12500
1383
+ },
1384
+ {
1385
+ "epoch": 103.0,
1386
+ "eval_accuracy": 0.8040464311680497,
1387
+ "eval_f1": 0.6305431998921687,
1388
+ "eval_loss": 1.7371126413345337,
1389
+ "eval_precision": 0.5956200662083015,
1390
+ "eval_recall": 0.6698167239404352,
1391
+ "eval_runtime": 2.9206,
1392
+ "eval_samples_per_second": 380.743,
1393
+ "eval_steps_per_second": 6.163,
1394
+ "step": 12566
1395
+ },
1396
+ {
1397
+ "epoch": 104.0,
1398
+ "eval_accuracy": 0.800737082257377,
1399
+ "eval_f1": 0.6273032952252858,
1400
+ "eval_loss": 1.771882176399231,
1401
+ "eval_precision": 0.5914278468171443,
1402
+ "eval_recall": 0.6678121420389461,
1403
+ "eval_runtime": 2.9022,
1404
+ "eval_samples_per_second": 383.156,
1405
+ "eval_steps_per_second": 6.202,
1406
+ "step": 12688
1407
+ },
1408
+ {
1409
+ "epoch": 105.0,
1410
+ "eval_accuracy": 0.804522776844586,
1411
+ "eval_f1": 0.6305525460455038,
1412
+ "eval_loss": 1.7473387718200684,
1413
+ "eval_precision": 0.5981500513874615,
1414
+ "eval_recall": 0.6666666666666666,
1415
+ "eval_runtime": 2.8767,
1416
+ "eval_samples_per_second": 386.556,
1417
+ "eval_steps_per_second": 6.257,
1418
+ "step": 12810
1419
+ },
1420
+ {
1421
+ "epoch": 106.0,
1422
+ "eval_accuracy": 0.803996289517888,
1423
+ "eval_f1": 0.6361556064073227,
1424
+ "eval_loss": 1.7518253326416016,
1425
+ "eval_precision": 0.6002032004064008,
1426
+ "eval_recall": 0.6766895761741123,
1427
+ "eval_runtime": 2.9045,
1428
+ "eval_samples_per_second": 382.852,
1429
+ "eval_steps_per_second": 6.197,
1430
+ "step": 12932
1431
+ },
1432
+ {
1433
+ "epoch": 106.56,
1434
+ "learning_rate": 3.3606557377049183e-06,
1435
+ "loss": 0.0019,
1436
+ "step": 13000
1437
+ },
1438
+ {
1439
+ "epoch": 107.0,
1440
+ "eval_accuracy": 0.804848697570637,
1441
+ "eval_f1": 0.6358241165362828,
1442
+ "eval_loss": 1.7628158330917358,
1443
+ "eval_precision": 0.6009688934217237,
1444
+ "eval_recall": 0.6749713631156931,
1445
+ "eval_runtime": 2.8942,
1446
+ "eval_samples_per_second": 384.222,
1447
+ "eval_steps_per_second": 6.219,
1448
+ "step": 13054
1449
+ },
1450
+ {
1451
+ "epoch": 108.0,
1452
+ "eval_accuracy": 0.7964750419936321,
1453
+ "eval_f1": 0.6344605475040258,
1454
+ "eval_loss": 1.8080339431762695,
1455
+ "eval_precision": 0.5969696969696969,
1456
+ "eval_recall": 0.6769759450171822,
1457
+ "eval_runtime": 2.9081,
1458
+ "eval_samples_per_second": 382.377,
1459
+ "eval_steps_per_second": 6.19,
1460
+ "step": 13176
1461
+ },
1462
+ {
1463
+ "epoch": 109.0,
1464
+ "eval_accuracy": 0.7985809913004237,
1465
+ "eval_f1": 0.6338519313304721,
1466
+ "eval_loss": 1.8027576208114624,
1467
+ "eval_precision": 0.5961150353178607,
1468
+ "eval_recall": 0.6766895761741123,
1469
+ "eval_runtime": 2.8859,
1470
+ "eval_samples_per_second": 385.324,
1471
+ "eval_steps_per_second": 6.237,
1472
+ "step": 13298
1473
+ },
1474
+ {
1475
+ "epoch": 110.0,
1476
+ "eval_accuracy": 0.8029683856895731,
1477
+ "eval_f1": 0.6342911102117901,
1478
+ "eval_loss": 1.781972050666809,
1479
+ "eval_precision": 0.5995919408314205,
1480
+ "eval_recall": 0.6732531500572738,
1481
+ "eval_runtime": 2.9186,
1482
+ "eval_samples_per_second": 380.999,
1483
+ "eval_steps_per_second": 6.167,
1484
+ "step": 13420
1485
+ },
1486
+ {
1487
+ "epoch": 110.66,
1488
+ "learning_rate": 2.336065573770492e-06,
1489
+ "loss": 0.0015,
1490
+ "step": 13500
1491
+ },
1492
+ {
1493
+ "epoch": 111.0,
1494
+ "eval_accuracy": 0.8029934565146539,
1495
+ "eval_f1": 0.6372588695534872,
1496
+ "eval_loss": 1.7890208959579468,
1497
+ "eval_precision": 0.6023973476154042,
1498
+ "eval_recall": 0.6764032073310424,
1499
+ "eval_runtime": 2.8911,
1500
+ "eval_samples_per_second": 384.624,
1501
+ "eval_steps_per_second": 6.226,
1502
+ "step": 13542
1503
+ },
1504
+ {
1505
+ "epoch": 112.0,
1506
+ "eval_accuracy": 0.8039712186928072,
1507
+ "eval_f1": 0.6397507112857337,
1508
+ "eval_loss": 1.7686277627944946,
1509
+ "eval_precision": 0.6070969400874261,
1510
+ "eval_recall": 0.6761168384879725,
1511
+ "eval_runtime": 2.9181,
1512
+ "eval_samples_per_second": 381.073,
1513
+ "eval_steps_per_second": 6.168,
1514
+ "step": 13664
1515
+ },
1516
+ {
1517
+ "epoch": 113.0,
1518
+ "eval_accuracy": 0.8046732017950711,
1519
+ "eval_f1": 0.6398059560706104,
1520
+ "eval_loss": 1.7624890804290771,
1521
+ "eval_precision": 0.6042249936370577,
1522
+ "eval_recall": 0.6798396334478809,
1523
+ "eval_runtime": 2.9383,
1524
+ "eval_samples_per_second": 378.455,
1525
+ "eval_steps_per_second": 6.126,
1526
+ "step": 13786
1527
+ },
1528
+ {
1529
+ "epoch": 114.0,
1530
+ "eval_accuracy": 0.8038960062175646,
1531
+ "eval_f1": 0.637657584383896,
1532
+ "eval_loss": 1.7636594772338867,
1533
+ "eval_precision": 0.6054054054054054,
1534
+ "eval_recall": 0.6735395189003437,
1535
+ "eval_runtime": 2.886,
1536
+ "eval_samples_per_second": 385.312,
1537
+ "eval_steps_per_second": 6.237,
1538
+ "step": 13908
1539
+ },
1540
+ {
1541
+ "epoch": 114.75,
1542
+ "learning_rate": 1.3114754098360657e-06,
1543
+ "loss": 0.0013,
1544
+ "step": 14000
1545
+ },
1546
+ {
1547
+ "epoch": 115.0,
1548
+ "eval_accuracy": 0.8037957229172412,
1549
+ "eval_f1": 0.638487508440243,
1550
+ "eval_loss": 1.7679991722106934,
1551
+ "eval_precision": 0.6041400460005111,
1552
+ "eval_recall": 0.6769759450171822,
1553
+ "eval_runtime": 2.8755,
1554
+ "eval_samples_per_second": 386.709,
1555
+ "eval_steps_per_second": 6.26,
1556
+ "step": 14030
1557
+ },
1558
+ {
1559
+ "epoch": 116.0,
1560
+ "eval_accuracy": 0.8028931732143305,
1561
+ "eval_f1": 0.6367303038451196,
1562
+ "eval_loss": 1.783056616783142,
1563
+ "eval_precision": 0.6001013684744044,
1564
+ "eval_recall": 0.6781214203894617,
1565
+ "eval_runtime": 2.9247,
1566
+ "eval_samples_per_second": 380.207,
1567
+ "eval_steps_per_second": 6.154,
1568
+ "step": 14152
1569
+ },
1570
+ {
1571
+ "epoch": 117.0,
1572
+ "eval_accuracy": 0.8021159776368241,
1573
+ "eval_f1": 0.6353479606945753,
1574
+ "eval_loss": 1.7854276895523071,
1575
+ "eval_precision": 0.5994411988823978,
1576
+ "eval_recall": 0.6758304696449027,
1577
+ "eval_runtime": 2.8934,
1578
+ "eval_samples_per_second": 384.319,
1579
+ "eval_steps_per_second": 6.221,
1580
+ "step": 14274
1581
+ },
1582
+ {
1583
+ "epoch": 118.0,
1584
+ "eval_accuracy": 0.8034447313661093,
1585
+ "eval_f1": 0.6355291576673866,
1586
+ "eval_loss": 1.7762141227722168,
1587
+ "eval_precision": 0.601123595505618,
1588
+ "eval_recall": 0.6741122565864834,
1589
+ "eval_runtime": 3.1021,
1590
+ "eval_samples_per_second": 358.473,
1591
+ "eval_steps_per_second": 5.803,
1592
+ "step": 14396
1593
+ },
1594
+ {
1595
+ "epoch": 118.85,
1596
+ "learning_rate": 2.8688524590163937e-07,
1597
+ "loss": 0.0012,
1598
+ "step": 14500
1599
+ },
1600
+ {
1601
+ "epoch": 119.0,
1602
+ "eval_accuracy": 0.8027928899140071,
1603
+ "eval_f1": 0.635909888034534,
1604
+ "eval_loss": 1.7778316736221313,
1605
+ "eval_precision": 0.6011221627135934,
1606
+ "eval_recall": 0.6749713631156931,
1607
+ "eval_runtime": 2.8654,
1608
+ "eval_samples_per_second": 388.078,
1609
+ "eval_steps_per_second": 6.282,
1610
+ "step": 14518
1611
+ },
1612
+ {
1613
+ "epoch": 120.0,
1614
+ "eval_accuracy": 0.8029182440394114,
1615
+ "eval_f1": 0.6360691144708422,
1616
+ "eval_loss": 1.777363896369934,
1617
+ "eval_precision": 0.6016343207354443,
1618
+ "eval_recall": 0.6746849942726232,
1619
+ "eval_runtime": 2.9475,
1620
+ "eval_samples_per_second": 377.274,
1621
+ "eval_steps_per_second": 6.107,
1622
+ "step": 14640
1623
+ },
1624
+ {
1625
+ "epoch": 120.0,
1626
+ "step": 14640,
1627
+ "total_flos": 1.220726808511488e+17,
1628
+ "train_loss": 0.10161643302465072,
1629
+ "train_runtime": 6791.1484,
1630
+ "train_samples_per_second": 137.561,
1631
+ "train_steps_per_second": 2.156
1632
+ }
1633
+ ],
1634
+ "max_steps": 14640,
1635
+ "num_train_epochs": 120,
1636
+ "total_flos": 1.220726808511488e+17,
1637
+ "trial_name": null,
1638
+ "trial_params": null
1639
+ }