stefan-it committed on
Commit db0a7ad · 1 Parent(s): a6a6e0a

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:26b7c45e6466d0f10b02f0816f382b01f87676559ce467ed29a353fa38f3dfb5
+ size 440966725
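The three added lines are a Git LFS pointer rather than the model weights themselves. As a minimal sketch, assuming the pointer text is available as a string, the fields can be read like this. The function name `parse_lfs_pointer` and the returned dictionary keys are illustrative and not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the "key value" lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the best-model.pt diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:26b7c45e6466d0f10b02f0816f382b01f87676559ce467ed29a353fa38f3dfb5
size 440966725
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["hash_algorithm"], pointer["size_bytes"])
```

The `size` field is the byte size of the real file that the pointer stands in for, so the checkpoint here is roughly 441 MB.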
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+ 1      20:47:41   0.0000         0.6106      0.1268    0.7079         0.7761      0.7404  0.6068
+ 2      20:48:47   0.0000         0.1224      0.1086    0.7629         0.7904      0.7764  0.6619
+ 3      20:49:50   0.0000         0.0739      0.1156    0.7978         0.8431      0.8198  0.7202
+ 4      20:50:52   0.0000         0.0485      0.1696    0.8086         0.8517      0.8296  0.7293
+ 5      20:51:54   0.0000         0.0361      0.1862    0.8186         0.8505      0.8343  0.7341
+ 6      20:52:57   0.0000         0.0252      0.1990    0.8367         0.8511      0.8438  0.7524
+ 7      20:53:59   0.0000         0.0194      0.2005    0.8495         0.8499      0.8497  0.7599
+ 8      20:55:02   0.0000         0.0118      0.1984    0.8475         0.8660      0.8567  0.7679
+ 9      20:56:04   0.0000         0.0079      0.2115    0.8604         0.8580      0.8592  0.7706
+ 10     20:57:07   0.0000         0.0057      0.2093    0.8519         0.8631      0.8575  0.7701
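The training log in this commit says the final evaluation uses the model from the best epoch, chosen by micro-averaged F1. The sketch below repeats that selection on the rows of loss.tsv, and also finds the epoch with the lowest dev loss for comparison. It assumes whitespace-separated columns; the real file is presumably tab-separated, in which case `csv.reader` with `delimiter="\t"` would be the natural choice. `LOSS_LOG` and `read_rows` are illustrative names, not part of Flair. The LEARNING_RATE column reads 0.0000 in every row, most likely because the configured rate of 5e-05 vanishes at four decimal places.

```python
# Rows copied from loss.tsv above (whitespace-separated for this sketch).
LOSS_LOG = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 20:47:41 0.0000 0.6106 0.1268 0.7079 0.7761 0.7404 0.6068
2 20:48:47 0.0000 0.1224 0.1086 0.7629 0.7904 0.7764 0.6619
3 20:49:50 0.0000 0.0739 0.1156 0.7978 0.8431 0.8198 0.7202
4 20:50:52 0.0000 0.0485 0.1696 0.8086 0.8517 0.8296 0.7293
5 20:51:54 0.0000 0.0361 0.1862 0.8186 0.8505 0.8343 0.7341
6 20:52:57 0.0000 0.0252 0.1990 0.8367 0.8511 0.8438 0.7524
7 20:53:59 0.0000 0.0194 0.2005 0.8495 0.8499 0.8497 0.7599
8 20:55:02 0.0000 0.0118 0.1984 0.8475 0.8660 0.8567 0.7679
9 20:56:04 0.0000 0.0079 0.2115 0.8604 0.8580 0.8592 0.7706
10 20:57:07 0.0000 0.0057 0.2093 0.8519 0.8631 0.8575 0.7701
"""


def read_rows(text: str) -> list[dict]:
    """Turn the header line and data lines into a list of dicts keyed by column name."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = read_rows(LOSS_LOG)

# Epoch with the highest dev F1 (the criterion named in the training log).
best_f1 = max(rows, key=lambda r: float(r["DEV_F1"]))
# Epoch with the lowest dev loss, for comparison.
lowest_loss = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print("best DEV_F1:", best_f1["EPOCH"], best_f1["DEV_F1"])
print("lowest DEV_LOSS:", lowest_loss["EPOCH"], lowest_loss["DEV_LOSS"])
```

The two criteria disagree here: dev loss is lowest at epoch 2, while dev F1 keeps improving until epoch 9, the last epoch for which the log reports "saving best model".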
runs/events.out.tfevents.1697575603.bce904bcef33.2482.5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e6c0aaf4191f227dd5b78ca0e2ecc7d30eade8a22e9a6fd2bc8bf9d963805af1
+ size 415388
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,244 @@
+ 2023-10-17 20:46:43,325 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:46:43,326 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): ElectraModel(
+       (embeddings): ElectraEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): ElectraEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x ElectraLayer(
+             (attention): ElectraAttention(
+               (self): ElectraSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): ElectraSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): ElectraIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): ElectraOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=21, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:46:43,329 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
+  - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
+ 2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:46:43,329 Train: 5901 sentences
+ 2023-10-17 20:46:43,329 (train_with_dev=False, train_with_test=False)
+ 2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:46:43,329 Training Params:
+ 2023-10-17 20:46:43,329 - learning_rate: "5e-05"
+ 2023-10-17 20:46:43,329 - mini_batch_size: "8"
+ 2023-10-17 20:46:43,329 - max_epochs: "10"
+ 2023-10-17 20:46:43,329 - shuffle: "True"
+ 2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:46:43,329 Plugins:
+ 2023-10-17 20:46:43,329 - TensorboardLogger
+ 2023-10-17 20:46:43,329 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:46:43,329 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-17 20:46:43,329 - metric: "('micro avg', 'f1-score')"
+ 2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:46:43,329 Computation:
+ 2023-10-17 20:46:43,329 - compute on device: cuda:0
+ 2023-10-17 20:46:43,329 - embedding storage: none
+ 2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:46:43,329 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
+ 2023-10-17 20:46:43,330 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:46:43,330 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:46:43,330 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-17 20:46:48,122 epoch 1 - iter 73/738 - loss 3.27628587 - time (sec): 4.79 - samples/sec: 3348.31 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 20:46:53,518 epoch 1 - iter 146/738 - loss 1.88031680 - time (sec): 10.19 - samples/sec: 3434.81 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 20:46:58,077 epoch 1 - iter 219/738 - loss 1.45313720 - time (sec): 14.75 - samples/sec: 3382.83 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 20:47:03,371 epoch 1 - iter 292/738 - loss 1.18325529 - time (sec): 20.04 - samples/sec: 3327.96 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 20:47:08,279 epoch 1 - iter 365/738 - loss 1.01353206 - time (sec): 24.95 - samples/sec: 3318.60 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 20:47:13,277 epoch 1 - iter 438/738 - loss 0.89228860 - time (sec): 29.95 - samples/sec: 3300.93 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 20:47:18,383 epoch 1 - iter 511/738 - loss 0.79797751 - time (sec): 35.05 - samples/sec: 3292.44 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-17 20:47:22,981 epoch 1 - iter 584/738 - loss 0.72397208 - time (sec): 39.65 - samples/sec: 3309.17 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-17 20:47:28,169 epoch 1 - iter 657/738 - loss 0.66245224 - time (sec): 44.84 - samples/sec: 3301.83 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-17 20:47:33,415 epoch 1 - iter 730/738 - loss 0.61711909 - time (sec): 50.08 - samples/sec: 3273.24 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-17 20:47:34,237 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:47:34,237 EPOCH 1 done: loss 0.6106 - lr: 0.000049
+ 2023-10-17 20:47:41,563 DEV : loss 0.1268201768398285 - f1-score (micro avg) 0.7404
+ 2023-10-17 20:47:41,593 saving best model
+ 2023-10-17 20:47:41,993 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:47:46,689 epoch 2 - iter 73/738 - loss 0.12468557 - time (sec): 4.69 - samples/sec: 3176.81 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-17 20:47:52,431 epoch 2 - iter 146/738 - loss 0.14402353 - time (sec): 10.44 - samples/sec: 3172.64 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-17 20:47:57,788 epoch 2 - iter 219/738 - loss 0.14540226 - time (sec): 15.79 - samples/sec: 3176.56 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-17 20:48:03,569 epoch 2 - iter 292/738 - loss 0.13897723 - time (sec): 21.57 - samples/sec: 3163.61 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-17 20:48:09,446 epoch 2 - iter 365/738 - loss 0.13575281 - time (sec): 27.45 - samples/sec: 3135.71 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-17 20:48:14,813 epoch 2 - iter 438/738 - loss 0.13129263 - time (sec): 32.82 - samples/sec: 3119.78 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-17 20:48:20,077 epoch 2 - iter 511/738 - loss 0.12744292 - time (sec): 38.08 - samples/sec: 3088.08 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-17 20:48:25,072 epoch 2 - iter 584/738 - loss 0.12567829 - time (sec): 43.08 - samples/sec: 3080.94 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-17 20:48:30,070 epoch 2 - iter 657/738 - loss 0.12273838 - time (sec): 48.08 - samples/sec: 3087.93 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-17 20:48:35,401 epoch 2 - iter 730/738 - loss 0.12256728 - time (sec): 53.41 - samples/sec: 3088.76 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-17 20:48:35,899 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:48:35,899 EPOCH 2 done: loss 0.1224 - lr: 0.000045
+ 2023-10-17 20:48:47,359 DEV : loss 0.1086253672838211 - f1-score (micro avg) 0.7764
+ 2023-10-17 20:48:47,392 saving best model
+ 2023-10-17 20:48:47,962 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:48:52,731 epoch 3 - iter 73/738 - loss 0.07098121 - time (sec): 4.77 - samples/sec: 3112.70 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-17 20:48:57,645 epoch 3 - iter 146/738 - loss 0.07973832 - time (sec): 9.68 - samples/sec: 3156.26 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-17 20:49:02,953 epoch 3 - iter 219/738 - loss 0.07961799 - time (sec): 14.99 - samples/sec: 3169.33 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-17 20:49:07,745 epoch 3 - iter 292/738 - loss 0.07850589 - time (sec): 19.78 - samples/sec: 3236.22 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-17 20:49:12,567 epoch 3 - iter 365/738 - loss 0.07402743 - time (sec): 24.60 - samples/sec: 3265.52 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-17 20:49:17,298 epoch 3 - iter 438/738 - loss 0.08001352 - time (sec): 29.33 - samples/sec: 3275.71 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-17 20:49:23,094 epoch 3 - iter 511/738 - loss 0.07748337 - time (sec): 35.13 - samples/sec: 3272.26 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-17 20:49:28,090 epoch 3 - iter 584/738 - loss 0.07564249 - time (sec): 40.12 - samples/sec: 3282.58 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-17 20:49:33,175 epoch 3 - iter 657/738 - loss 0.07343420 - time (sec): 45.21 - samples/sec: 3285.72 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-17 20:49:38,072 epoch 3 - iter 730/738 - loss 0.07366471 - time (sec): 50.11 - samples/sec: 3286.92 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-17 20:49:38,600 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:49:38,601 EPOCH 3 done: loss 0.0739 - lr: 0.000039
+ 2023-10-17 20:49:49,997 DEV : loss 0.11555362492799759 - f1-score (micro avg) 0.8198
+ 2023-10-17 20:49:50,031 saving best model
+ 2023-10-17 20:49:50,487 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:49:55,834 epoch 4 - iter 73/738 - loss 0.04160062 - time (sec): 5.34 - samples/sec: 3175.79 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-17 20:50:01,239 epoch 4 - iter 146/738 - loss 0.04439446 - time (sec): 10.74 - samples/sec: 3270.24 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-17 20:50:06,206 epoch 4 - iter 219/738 - loss 0.04370494 - time (sec): 15.71 - samples/sec: 3268.86 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-17 20:50:11,401 epoch 4 - iter 292/738 - loss 0.04811555 - time (sec): 20.90 - samples/sec: 3253.28 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-17 20:50:16,097 epoch 4 - iter 365/738 - loss 0.04926683 - time (sec): 25.60 - samples/sec: 3256.26 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-17 20:50:20,727 epoch 4 - iter 438/738 - loss 0.04889972 - time (sec): 30.23 - samples/sec: 3275.71 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-17 20:50:25,568 epoch 4 - iter 511/738 - loss 0.04922740 - time (sec): 35.07 - samples/sec: 3286.15 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-17 20:50:30,582 epoch 4 - iter 584/738 - loss 0.04769097 - time (sec): 40.09 - samples/sec: 3295.61 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-17 20:50:35,683 epoch 4 - iter 657/738 - loss 0.04770829 - time (sec): 45.19 - samples/sec: 3308.46 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-17 20:50:40,337 epoch 4 - iter 730/738 - loss 0.04843947 - time (sec): 49.84 - samples/sec: 3304.22 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-17 20:50:40,884 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:50:40,885 EPOCH 4 done: loss 0.0485 - lr: 0.000033
+ 2023-10-17 20:50:52,292 DEV : loss 0.1695607602596283 - f1-score (micro avg) 0.8296
+ 2023-10-17 20:50:52,322 saving best model
+ 2023-10-17 20:50:52,822 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:50:58,011 epoch 5 - iter 73/738 - loss 0.03608107 - time (sec): 5.19 - samples/sec: 3363.48 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-17 20:51:02,971 epoch 5 - iter 146/738 - loss 0.03199147 - time (sec): 10.15 - samples/sec: 3357.55 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-17 20:51:08,337 epoch 5 - iter 219/738 - loss 0.03313355 - time (sec): 15.51 - samples/sec: 3354.59 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-17 20:51:13,302 epoch 5 - iter 292/738 - loss 0.03443954 - time (sec): 20.48 - samples/sec: 3348.04 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-17 20:51:18,206 epoch 5 - iter 365/738 - loss 0.03367944 - time (sec): 25.38 - samples/sec: 3362.98 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-17 20:51:23,370 epoch 5 - iter 438/738 - loss 0.03267726 - time (sec): 30.55 - samples/sec: 3345.35 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 20:51:27,818 epoch 5 - iter 511/738 - loss 0.03446430 - time (sec): 34.99 - samples/sec: 3341.96 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 20:51:32,447 epoch 5 - iter 584/738 - loss 0.03493802 - time (sec): 39.62 - samples/sec: 3334.16 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 20:51:37,619 epoch 5 - iter 657/738 - loss 0.03468427 - time (sec): 44.80 - samples/sec: 3305.72 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 20:51:42,797 epoch 5 - iter 730/738 - loss 0.03596301 - time (sec): 49.97 - samples/sec: 3302.27 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 20:51:43,259 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:51:43,259 EPOCH 5 done: loss 0.0361 - lr: 0.000028
+ 2023-10-17 20:51:54,959 DEV : loss 0.18621283769607544 - f1-score (micro avg) 0.8343
+ 2023-10-17 20:51:54,989 saving best model
+ 2023-10-17 20:51:55,450 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:52:00,421 epoch 6 - iter 73/738 - loss 0.03314020 - time (sec): 4.97 - samples/sec: 3267.43 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 20:52:05,581 epoch 6 - iter 146/738 - loss 0.03144585 - time (sec): 10.13 - samples/sec: 3177.13 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 20:52:10,154 epoch 6 - iter 219/738 - loss 0.02846193 - time (sec): 14.70 - samples/sec: 3204.07 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 20:52:15,181 epoch 6 - iter 292/738 - loss 0.02617097 - time (sec): 19.73 - samples/sec: 3233.54 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 20:52:20,401 epoch 6 - iter 365/738 - loss 0.02459012 - time (sec): 24.95 - samples/sec: 3217.27 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 20:52:24,933 epoch 6 - iter 438/738 - loss 0.02420569 - time (sec): 29.48 - samples/sec: 3251.94 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 20:52:29,485 epoch 6 - iter 511/738 - loss 0.02455905 - time (sec): 34.03 - samples/sec: 3278.27 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 20:52:34,344 epoch 6 - iter 584/738 - loss 0.02479615 - time (sec): 38.89 - samples/sec: 3268.69 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 20:52:40,085 epoch 6 - iter 657/738 - loss 0.02602451 - time (sec): 44.63 - samples/sec: 3301.26 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 20:52:45,132 epoch 6 - iter 730/738 - loss 0.02514643 - time (sec): 49.68 - samples/sec: 3302.41 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 20:52:45,851 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:52:45,852 EPOCH 6 done: loss 0.0252 - lr: 0.000022
+ 2023-10-17 20:52:57,524 DEV : loss 0.19898028671741486 - f1-score (micro avg) 0.8438
+ 2023-10-17 20:52:57,557 saving best model
+ 2023-10-17 20:52:58,043 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:53:02,903 epoch 7 - iter 73/738 - loss 0.01475665 - time (sec): 4.86 - samples/sec: 3133.95 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 20:53:07,464 epoch 7 - iter 146/738 - loss 0.01079426 - time (sec): 9.42 - samples/sec: 3313.04 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 20:53:12,052 epoch 7 - iter 219/738 - loss 0.01197725 - time (sec): 14.01 - samples/sec: 3264.54 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 20:53:16,929 epoch 7 - iter 292/738 - loss 0.01292926 - time (sec): 18.88 - samples/sec: 3284.79 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 20:53:21,733 epoch 7 - iter 365/738 - loss 0.01589341 - time (sec): 23.69 - samples/sec: 3296.02 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 20:53:26,782 epoch 7 - iter 438/738 - loss 0.01627966 - time (sec): 28.74 - samples/sec: 3340.14 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 20:53:32,829 epoch 7 - iter 511/738 - loss 0.01882907 - time (sec): 34.79 - samples/sec: 3352.44 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 20:53:37,503 epoch 7 - iter 584/738 - loss 0.01875664 - time (sec): 39.46 - samples/sec: 3353.64 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 20:53:42,403 epoch 7 - iter 657/738 - loss 0.01898628 - time (sec): 44.36 - samples/sec: 3350.33 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 20:53:47,332 epoch 7 - iter 730/738 - loss 0.01903040 - time (sec): 49.29 - samples/sec: 3341.12 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 20:53:47,894 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:53:47,894 EPOCH 7 done: loss 0.0194 - lr: 0.000017
+ 2023-10-17 20:53:59,398 DEV : loss 0.2004670649766922 - f1-score (micro avg) 0.8497
+ 2023-10-17 20:53:59,432 saving best model
+ 2023-10-17 20:53:59,916 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:54:04,698 epoch 8 - iter 73/738 - loss 0.01638856 - time (sec): 4.78 - samples/sec: 3252.98 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 20:54:09,704 epoch 8 - iter 146/738 - loss 0.01268644 - time (sec): 9.79 - samples/sec: 3215.86 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 20:54:14,523 epoch 8 - iter 219/738 - loss 0.01154489 - time (sec): 14.61 - samples/sec: 3238.88 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 20:54:19,163 epoch 8 - iter 292/738 - loss 0.01121476 - time (sec): 19.25 - samples/sec: 3253.95 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 20:54:25,326 epoch 8 - iter 365/738 - loss 0.01237795 - time (sec): 25.41 - samples/sec: 3256.44 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 20:54:31,141 epoch 8 - iter 438/738 - loss 0.01216074 - time (sec): 31.22 - samples/sec: 3254.28 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 20:54:36,019 epoch 8 - iter 511/738 - loss 0.01129103 - time (sec): 36.10 - samples/sec: 3253.41 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 20:54:41,054 epoch 8 - iter 584/738 - loss 0.01142640 - time (sec): 41.14 - samples/sec: 3261.75 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 20:54:45,906 epoch 8 - iter 657/738 - loss 0.01201789 - time (sec): 45.99 - samples/sec: 3247.76 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 20:54:50,232 epoch 8 - iter 730/738 - loss 0.01177942 - time (sec): 50.31 - samples/sec: 3270.00 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 20:54:50,766 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:54:50,767 EPOCH 8 done: loss 0.0118 - lr: 0.000011
+ 2023-10-17 20:55:02,150 DEV : loss 0.198430597782135 - f1-score (micro avg) 0.8567
+ 2023-10-17 20:55:02,181 saving best model
+ 2023-10-17 20:55:02,675 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:55:07,868 epoch 9 - iter 73/738 - loss 0.01519568 - time (sec): 5.19 - samples/sec: 3455.39 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 20:55:12,960 epoch 9 - iter 146/738 - loss 0.00937816 - time (sec): 10.28 - samples/sec: 3348.66 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 20:55:17,958 epoch 9 - iter 219/738 - loss 0.00848427 - time (sec): 15.28 - samples/sec: 3265.03 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 20:55:23,209 epoch 9 - iter 292/738 - loss 0.00799314 - time (sec): 20.53 - samples/sec: 3258.21 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 20:55:27,918 epoch 9 - iter 365/738 - loss 0.00807520 - time (sec): 25.24 - samples/sec: 3272.15 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 20:55:33,094 epoch 9 - iter 438/738 - loss 0.00784412 - time (sec): 30.42 - samples/sec: 3286.92 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 20:55:38,313 epoch 9 - iter 511/738 - loss 0.00889544 - time (sec): 35.64 - samples/sec: 3274.78 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 20:55:43,202 epoch 9 - iter 584/738 - loss 0.00926012 - time (sec): 40.52 - samples/sec: 3281.89 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 20:55:48,350 epoch 9 - iter 657/738 - loss 0.00863704 - time (sec): 45.67 - samples/sec: 3283.07 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 20:55:52,805 epoch 9 - iter 730/738 - loss 0.00798481 - time (sec): 50.13 - samples/sec: 3290.70 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 20:55:53,313 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:55:53,313 EPOCH 9 done: loss 0.0079 - lr: 0.000006
+ 2023-10-17 20:56:04,772 DEV : loss 0.2114827036857605 - f1-score (micro avg) 0.8592
+ 2023-10-17 20:56:04,806 saving best model
+ 2023-10-17 20:56:05,225 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:56:10,423 epoch 10 - iter 73/738 - loss 0.00697079 - time (sec): 5.20 - samples/sec: 3137.08 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 20:56:16,026 epoch 10 - iter 146/738 - loss 0.00806024 - time (sec): 10.80 - samples/sec: 3238.10 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 20:56:20,994 epoch 10 - iter 219/738 - loss 0.00854713 - time (sec): 15.77 - samples/sec: 3195.31 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 20:56:26,151 epoch 10 - iter 292/738 - loss 0.00725008 - time (sec): 20.92 - samples/sec: 3224.88 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 20:56:30,811 epoch 10 - iter 365/738 - loss 0.00633495 - time (sec): 25.58 - samples/sec: 3248.59 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 20:56:35,376 epoch 10 - iter 438/738 - loss 0.00772199 - time (sec): 30.15 - samples/sec: 3274.22 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 20:56:40,499 epoch 10 - iter 511/738 - loss 0.00693419 - time (sec): 35.27 - samples/sec: 3250.26 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 20:56:45,335 epoch 10 - iter 584/738 - loss 0.00661804 - time (sec): 40.11 - samples/sec: 3269.64 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 20:56:50,238 epoch 10 - iter 657/738 - loss 0.00621935 - time (sec): 45.01 - samples/sec: 3277.65 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 20:56:55,354 epoch 10 - iter 730/738 - loss 0.00579273 - time (sec): 50.13 - samples/sec: 3289.07 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 20:56:55,849 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:56:55,849 EPOCH 10 done: loss 0.0057 - lr: 0.000000
+ 2023-10-17 20:57:07,458 DEV : loss 0.2092556357383728 - f1-score (micro avg) 0.8575
+ 2023-10-17 20:57:07,845 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:57:07,846 Loading model from best epoch ...
+ 2023-10-17 20:57:09,560 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
+ 2023-10-17 20:57:15,748
+ Results:
+ - F-score (micro) 0.8061
+ - F-score (macro) 0.7226
+ - Accuracy 0.6955
+
+ By class:
+               precision    recall  f1-score   support
+
+          loc     0.8573    0.8753    0.8662       858
+         pers     0.7504    0.8175    0.7825       537
+          org     0.6579    0.5682    0.6098       132
+         time     0.5806    0.6667    0.6207        54
+         prod     0.8333    0.6557    0.7339        61
+
+    micro avg     0.7958    0.8167    0.8061      1642
+    macro avg     0.7359    0.7167    0.7226      1642
+ weighted avg     0.7963    0.8167    0.8052      1642
+
+ 2023-10-17 20:57:15,748 ----------------------------------------------------------------------------------------------------
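The averaged rows of the test report can be cross-checked against the per-class rows. The sketch below recomputes the macro and weighted F1 from the per-class values and the micro F1 from the reported micro precision and recall. It uses the standard definitions of these averages and is not Flair's own evaluation code. `REPORT` and `f1` are illustrative names, and the inputs are the four-decimal values from the log, so results agree only up to rounding.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
REPORT = {
    "loc": (0.8573, 0.8753, 0.8662, 858),
    "pers": (0.7504, 0.8175, 0.7825, 537),
    "org": (0.6579, 0.5682, 0.6098, 132),
    "time": (0.5806, 0.6667, 0.6207, 54),
    "prod": (0.8333, 0.6557, 0.7339, 61),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(support for *_, support in REPORT.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(score for _, _, score, _ in REPORT.values()) / len(REPORT)

# Weighted average: per-class F1 weighted by class support.
weighted_f1 = sum(score * support for _, _, score, support in REPORT.values()) / total_support

# Micro average: F1 of the pooled precision and recall reported in the log.
micro_f1 = f1(0.7958, 0.8167)

print(f"support={total_support} macro={macro_f1:.4f} "
      f"weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

The macro score sits well below the micro score because the two weakest classes, org and time, are also among the smallest, so they pull down the unweighted mean more than the pooled counts.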