Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697575603.bce904bcef33.2482.5 +3 -0
- test.tsv +0 -0
- training.log +244 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:26b7c45e6466d0f10b02f0816f382b01f87676559ce467ed29a353fa38f3dfb5
size 440966725
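The three lines above are a Git LFS pointer: the repository stores this small text stub, while the real checkpoint lives in LFS storage. As a minimal sketch, assuming only the space-separated `key value` layout visible above, the pointer can be parsed as follows. The helper name `parse_lfs_pointer` is hypothetical and not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:26b7c45e6466d0f10b02f0816f382b01f87676559ce467ed29a353fa38f3dfb5
size 440966725
"""

pointer = parse_lfs_pointer(POINTER)
size_mib = pointer["size_bytes"] / 2**20
print(pointer["hash_algo"], f"{size_mib:.1f} MiB")
```

The `size` field is the byte size of the real file, so the checkpoint behind this pointer is roughly 420 MiB.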
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
1      20:47:41   0.0000         0.6106      0.1268    0.7079         0.7761      0.7404  0.6068
2      20:48:47   0.0000         0.1224      0.1086    0.7629         0.7904      0.7764  0.6619
3      20:49:50   0.0000         0.0739      0.1156    0.7978         0.8431      0.8198  0.7202
4      20:50:52   0.0000         0.0485      0.1696    0.8086         0.8517      0.8296  0.7293
5      20:51:54   0.0000         0.0361      0.1862    0.8186         0.8505      0.8343  0.7341
6      20:52:57   0.0000         0.0252      0.1990    0.8367         0.8511      0.8438  0.7524
7      20:53:59   0.0000         0.0194      0.2005    0.8495         0.8499      0.8497  0.7599
8      20:55:02   0.0000         0.0118      0.1984    0.8475         0.8660      0.8567  0.7679
9      20:56:04   0.0000         0.0079      0.2115    0.8604         0.8580      0.8592  0.7706
10     20:57:07   0.0000         0.0057      0.2093    0.8519         0.8631      0.8575  0.7701
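To read the table above programmatically, the following sketch splits each row on whitespace and picks out the epoch with the highest DEV_F1 and the one with the lowest DEV_LOSS. The table is embedded as a string so the example is self-contained; the function name `parse_loss_table` is hypothetical, and a real loss.tsv would be read from disk instead.

```python
LOSS_TABLE = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 20:47:41 0.0000 0.6106 0.1268 0.7079 0.7761 0.7404 0.6068
2 20:48:47 0.0000 0.1224 0.1086 0.7629 0.7904 0.7764 0.6619
3 20:49:50 0.0000 0.0739 0.1156 0.7978 0.8431 0.8198 0.7202
4 20:50:52 0.0000 0.0485 0.1696 0.8086 0.8517 0.8296 0.7293
5 20:51:54 0.0000 0.0361 0.1862 0.8186 0.8505 0.8343 0.7341
6 20:52:57 0.0000 0.0252 0.1990 0.8367 0.8511 0.8438 0.7524
7 20:53:59 0.0000 0.0194 0.2005 0.8495 0.8499 0.8497 0.7599
8 20:55:02 0.0000 0.0118 0.1984 0.8475 0.8660 0.8567 0.7679
9 20:56:04 0.0000 0.0079 0.2115 0.8604 0.8580 0.8592 0.7706
10 20:57:07 0.0000 0.0057 0.2093 0.8519 0.8631 0.8575 0.7701
"""


def parse_loss_table(text: str) -> list:
    """Turn the whitespace-separated table into a list of row dicts."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = parse_loss_table(LOSS_TABLE)
best_f1_row = max(rows, key=lambda r: float(r["DEV_F1"]))
lowest_loss_row = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print("best DEV_F1:", best_f1_row["EPOCH"], best_f1_row["DEV_F1"])
print("lowest DEV_LOSS:", lowest_loss_row["EPOCH"], lowest_loss_row["DEV_LOSS"])
```

In this run the two criteria disagree: DEV_LOSS is lowest at epoch 2, while DEV_F1 peaks at epoch 9. The LEARNING_RATE column prints as 0.0000 only because values on the order of 1e-05 vanish at four decimal places.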
runs/events.out.tfevents.1697575603.bce904bcef33.2482.5
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e6c0aaf4191f227dd5b78ca0e2ecc7d30eade8a22e9a6fd2bc8bf9d963805af1
size 415388
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,244 @@
2023-10-17 20:46:43,325 ----------------------------------------------------------------------------------------------------
2023-10-17 20:46:43,326 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
2023-10-17 20:46:43,329 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
2023-10-17 20:46:43,329 Train: 5901 sentences
2023-10-17 20:46:43,329 (train_with_dev=False, train_with_test=False)
2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
2023-10-17 20:46:43,329 Training Params:
2023-10-17 20:46:43,329 - learning_rate: "5e-05"
2023-10-17 20:46:43,329 - mini_batch_size: "8"
2023-10-17 20:46:43,329 - max_epochs: "10"
2023-10-17 20:46:43,329 - shuffle: "True"
2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
2023-10-17 20:46:43,329 Plugins:
2023-10-17 20:46:43,329 - TensorboardLogger
2023-10-17 20:46:43,329 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
2023-10-17 20:46:43,329 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:46:43,329 - metric: "('micro avg', 'f1-score')"
2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
2023-10-17 20:46:43,329 Computation:
2023-10-17 20:46:43,329 - compute on device: cuda:0
2023-10-17 20:46:43,329 - embedding storage: none
2023-10-17 20:46:43,329 ----------------------------------------------------------------------------------------------------
2023-10-17 20:46:43,329 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 20:46:43,330 ----------------------------------------------------------------------------------------------------
2023-10-17 20:46:43,330 ----------------------------------------------------------------------------------------------------
2023-10-17 20:46:43,330 Logging anything other than scalars to TensorBoard is currently not supported.
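The parameters above (5901 training sentences, mini-batch size 8, 10 epochs, peak learning rate 5e-05, warmup fraction 0.1) determine the `iter N/738` counter and the `lr` column in the per-iteration log lines. The sketch below assumes a plain linear warmup followed by linear decay to zero, stepped once per mini-batch. It is not taken from Flair's `LinearScheduler` source, and the function name `linear_warmup_decay_lr` is hypothetical.

```python
import math


def linear_warmup_decay_lr(step: int, total_steps: int, peak_lr: float,
                           warmup_fraction: float) -> float:
    """Learning rate after `step` optimizer steps (assumed schedule shape)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr during warmup.
        return peak_lr * step / warmup_steps
    # Linear decay from peak_lr down to 0 over the remaining steps.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


train_sentences, batch_size, epochs = 5901, 8, 10
steps_per_epoch = math.ceil(train_sentences / batch_size)
total_steps = steps_per_epoch * epochs


def lr_at(epoch: int, iteration: int) -> str:
    """Format the assumed learning rate the way the log prints it."""
    step = (epoch - 1) * steps_per_epoch + iteration
    return f"{linear_warmup_decay_lr(step, total_steps, 5e-05, 0.1):.6f}"


print(steps_per_epoch, total_steps)
print(lr_at(1, 73), lr_at(1, 730), lr_at(2, 730), lr_at(10, 730))
```

Under these assumptions, warmup covers exactly the first epoch (738 of 7380 steps), and the formatted values agree with the logged `lr` at the iterations spot-checked here.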
2023-10-17 20:46:48,122 epoch 1 - iter 73/738 - loss 3.27628587 - time (sec): 4.79 - samples/sec: 3348.31 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:46:53,518 epoch 1 - iter 146/738 - loss 1.88031680 - time (sec): 10.19 - samples/sec: 3434.81 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:46:58,077 epoch 1 - iter 219/738 - loss 1.45313720 - time (sec): 14.75 - samples/sec: 3382.83 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:47:03,371 epoch 1 - iter 292/738 - loss 1.18325529 - time (sec): 20.04 - samples/sec: 3327.96 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:47:08,279 epoch 1 - iter 365/738 - loss 1.01353206 - time (sec): 24.95 - samples/sec: 3318.60 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:47:13,277 epoch 1 - iter 438/738 - loss 0.89228860 - time (sec): 29.95 - samples/sec: 3300.93 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:47:18,383 epoch 1 - iter 511/738 - loss 0.79797751 - time (sec): 35.05 - samples/sec: 3292.44 - lr: 0.000035 - momentum: 0.000000
2023-10-17 20:47:22,981 epoch 1 - iter 584/738 - loss 0.72397208 - time (sec): 39.65 - samples/sec: 3309.17 - lr: 0.000039 - momentum: 0.000000
2023-10-17 20:47:28,169 epoch 1 - iter 657/738 - loss 0.66245224 - time (sec): 44.84 - samples/sec: 3301.83 - lr: 0.000044 - momentum: 0.000000
2023-10-17 20:47:33,415 epoch 1 - iter 730/738 - loss 0.61711909 - time (sec): 50.08 - samples/sec: 3273.24 - lr: 0.000049 - momentum: 0.000000
2023-10-17 20:47:34,237 ----------------------------------------------------------------------------------------------------
2023-10-17 20:47:34,237 EPOCH 1 done: loss 0.6106 - lr: 0.000049
2023-10-17 20:47:41,563 DEV : loss 0.1268201768398285 - f1-score (micro avg) 0.7404
2023-10-17 20:47:41,593 saving best model
2023-10-17 20:47:41,993 ----------------------------------------------------------------------------------------------------
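The per-iteration lines follow a fixed pattern, so they can be turned into records for plotting or analysis. The regular expression below is a sketch written against the line layout visible in this log only; other Flair versions may print different fields, and the names `ITER_LINE` and `parse_iter_line` are hypothetical.

```python
import re

# Matches lines such as:
# "... epoch 1 - iter 73/738 - loss 3.27628587 - time (sec): 4.79 - samples/sec: 3348.31 - lr: 0.000005 - ..."
ITER_LINE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+)"
    r" - loss (?P<loss>[\d.]+)"
    r" - time \(sec\): (?P<seconds>[\d.]+)"
    r" - samples/sec: (?P<rate>[\d.]+)"
    r" - lr: (?P<lr>[\d.]+)"
)


def parse_iter_line(line: str):
    """Return a dict for a per-iteration line, or None for any other line."""
    match = ITER_LINE.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "iter": int(match["iter"]),
        "total_iters": int(match["total"]),
        "loss": float(match["loss"]),
        "seconds": float(match["seconds"]),
        "samples_per_sec": float(match["rate"]),
        "lr": float(match["lr"]),
    }


sample = (
    "2023-10-17 20:46:48,122 epoch 1 - iter 73/738 - loss 3.27628587 "
    "- time (sec): 4.79 - samples/sec: 3348.31 - lr: 0.000005 - momentum: 0.000000"
)
record = parse_iter_line(sample)
summary = parse_iter_line("2023-10-17 20:47:34,237 EPOCH 1 done: loss 0.6106 - lr: 0.000049")
print(record)
print(summary)
```

The match is case-sensitive, so the `EPOCH 1 done` summary lines are skipped rather than misread as iteration records.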
2023-10-17 20:47:46,689 epoch 2 - iter 73/738 - loss 0.12468557 - time (sec): 4.69 - samples/sec: 3176.81 - lr: 0.000049 - momentum: 0.000000
2023-10-17 20:47:52,431 epoch 2 - iter 146/738 - loss 0.14402353 - time (sec): 10.44 - samples/sec: 3172.64 - lr: 0.000049 - momentum: 0.000000
2023-10-17 20:47:57,788 epoch 2 - iter 219/738 - loss 0.14540226 - time (sec): 15.79 - samples/sec: 3176.56 - lr: 0.000048 - momentum: 0.000000
2023-10-17 20:48:03,569 epoch 2 - iter 292/738 - loss 0.13897723 - time (sec): 21.57 - samples/sec: 3163.61 - lr: 0.000048 - momentum: 0.000000
2023-10-17 20:48:09,446 epoch 2 - iter 365/738 - loss 0.13575281 - time (sec): 27.45 - samples/sec: 3135.71 - lr: 0.000047 - momentum: 0.000000
2023-10-17 20:48:14,813 epoch 2 - iter 438/738 - loss 0.13129263 - time (sec): 32.82 - samples/sec: 3119.78 - lr: 0.000047 - momentum: 0.000000
2023-10-17 20:48:20,077 epoch 2 - iter 511/738 - loss 0.12744292 - time (sec): 38.08 - samples/sec: 3088.08 - lr: 0.000046 - momentum: 0.000000
2023-10-17 20:48:25,072 epoch 2 - iter 584/738 - loss 0.12567829 - time (sec): 43.08 - samples/sec: 3080.94 - lr: 0.000046 - momentum: 0.000000
2023-10-17 20:48:30,070 epoch 2 - iter 657/738 - loss 0.12273838 - time (sec): 48.08 - samples/sec: 3087.93 - lr: 0.000045 - momentum: 0.000000
2023-10-17 20:48:35,401 epoch 2 - iter 730/738 - loss 0.12256728 - time (sec): 53.41 - samples/sec: 3088.76 - lr: 0.000045 - momentum: 0.000000
2023-10-17 20:48:35,899 ----------------------------------------------------------------------------------------------------
2023-10-17 20:48:35,899 EPOCH 2 done: loss 0.1224 - lr: 0.000045
2023-10-17 20:48:47,359 DEV : loss 0.1086253672838211 - f1-score (micro avg) 0.7764
2023-10-17 20:48:47,392 saving best model
2023-10-17 20:48:47,962 ----------------------------------------------------------------------------------------------------
2023-10-17 20:48:52,731 epoch 3 - iter 73/738 - loss 0.07098121 - time (sec): 4.77 - samples/sec: 3112.70 - lr: 0.000044 - momentum: 0.000000
2023-10-17 20:48:57,645 epoch 3 - iter 146/738 - loss 0.07973832 - time (sec): 9.68 - samples/sec: 3156.26 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:49:02,953 epoch 3 - iter 219/738 - loss 0.07961799 - time (sec): 14.99 - samples/sec: 3169.33 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:49:07,745 epoch 3 - iter 292/738 - loss 0.07850589 - time (sec): 19.78 - samples/sec: 3236.22 - lr: 0.000042 - momentum: 0.000000
2023-10-17 20:49:12,567 epoch 3 - iter 365/738 - loss 0.07402743 - time (sec): 24.60 - samples/sec: 3265.52 - lr: 0.000042 - momentum: 0.000000
2023-10-17 20:49:17,298 epoch 3 - iter 438/738 - loss 0.08001352 - time (sec): 29.33 - samples/sec: 3275.71 - lr: 0.000041 - momentum: 0.000000
2023-10-17 20:49:23,094 epoch 3 - iter 511/738 - loss 0.07748337 - time (sec): 35.13 - samples/sec: 3272.26 - lr: 0.000041 - momentum: 0.000000
2023-10-17 20:49:28,090 epoch 3 - iter 584/738 - loss 0.07564249 - time (sec): 40.12 - samples/sec: 3282.58 - lr: 0.000040 - momentum: 0.000000
2023-10-17 20:49:33,175 epoch 3 - iter 657/738 - loss 0.07343420 - time (sec): 45.21 - samples/sec: 3285.72 - lr: 0.000040 - momentum: 0.000000
2023-10-17 20:49:38,072 epoch 3 - iter 730/738 - loss 0.07366471 - time (sec): 50.11 - samples/sec: 3286.92 - lr: 0.000039 - momentum: 0.000000
2023-10-17 20:49:38,600 ----------------------------------------------------------------------------------------------------
2023-10-17 20:49:38,601 EPOCH 3 done: loss 0.0739 - lr: 0.000039
2023-10-17 20:49:49,997 DEV : loss 0.11555362492799759 - f1-score (micro avg) 0.8198
2023-10-17 20:49:50,031 saving best model
2023-10-17 20:49:50,487 ----------------------------------------------------------------------------------------------------
2023-10-17 20:49:55,834 epoch 4 - iter 73/738 - loss 0.04160062 - time (sec): 5.34 - samples/sec: 3175.79 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:50:01,239 epoch 4 - iter 146/738 - loss 0.04439446 - time (sec): 10.74 - samples/sec: 3270.24 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:50:06,206 epoch 4 - iter 219/738 - loss 0.04370494 - time (sec): 15.71 - samples/sec: 3268.86 - lr: 0.000037 - momentum: 0.000000
2023-10-17 20:50:11,401 epoch 4 - iter 292/738 - loss 0.04811555 - time (sec): 20.90 - samples/sec: 3253.28 - lr: 0.000037 - momentum: 0.000000
2023-10-17 20:50:16,097 epoch 4 - iter 365/738 - loss 0.04926683 - time (sec): 25.60 - samples/sec: 3256.26 - lr: 0.000036 - momentum: 0.000000
2023-10-17 20:50:20,727 epoch 4 - iter 438/738 - loss 0.04889972 - time (sec): 30.23 - samples/sec: 3275.71 - lr: 0.000036 - momentum: 0.000000
2023-10-17 20:50:25,568 epoch 4 - iter 511/738 - loss 0.04922740 - time (sec): 35.07 - samples/sec: 3286.15 - lr: 0.000035 - momentum: 0.000000
2023-10-17 20:50:30,582 epoch 4 - iter 584/738 - loss 0.04769097 - time (sec): 40.09 - samples/sec: 3295.61 - lr: 0.000035 - momentum: 0.000000
2023-10-17 20:50:35,683 epoch 4 - iter 657/738 - loss 0.04770829 - time (sec): 45.19 - samples/sec: 3308.46 - lr: 0.000034 - momentum: 0.000000
2023-10-17 20:50:40,337 epoch 4 - iter 730/738 - loss 0.04843947 - time (sec): 49.84 - samples/sec: 3304.22 - lr: 0.000033 - momentum: 0.000000
2023-10-17 20:50:40,884 ----------------------------------------------------------------------------------------------------
2023-10-17 20:50:40,885 EPOCH 4 done: loss 0.0485 - lr: 0.000033
2023-10-17 20:50:52,292 DEV : loss 0.1695607602596283 - f1-score (micro avg) 0.8296
2023-10-17 20:50:52,322 saving best model
2023-10-17 20:50:52,822 ----------------------------------------------------------------------------------------------------
2023-10-17 20:50:58,011 epoch 5 - iter 73/738 - loss 0.03608107 - time (sec): 5.19 - samples/sec: 3363.48 - lr: 0.000033 - momentum: 0.000000
2023-10-17 20:51:02,971 epoch 5 - iter 146/738 - loss 0.03199147 - time (sec): 10.15 - samples/sec: 3357.55 - lr: 0.000032 - momentum: 0.000000
2023-10-17 20:51:08,337 epoch 5 - iter 219/738 - loss 0.03313355 - time (sec): 15.51 - samples/sec: 3354.59 - lr: 0.000032 - momentum: 0.000000
2023-10-17 20:51:13,302 epoch 5 - iter 292/738 - loss 0.03443954 - time (sec): 20.48 - samples/sec: 3348.04 - lr: 0.000031 - momentum: 0.000000
2023-10-17 20:51:18,206 epoch 5 - iter 365/738 - loss 0.03367944 - time (sec): 25.38 - samples/sec: 3362.98 - lr: 0.000031 - momentum: 0.000000
2023-10-17 20:51:23,370 epoch 5 - iter 438/738 - loss 0.03267726 - time (sec): 30.55 - samples/sec: 3345.35 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:51:27,818 epoch 5 - iter 511/738 - loss 0.03446430 - time (sec): 34.99 - samples/sec: 3341.96 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:51:32,447 epoch 5 - iter 584/738 - loss 0.03493802 - time (sec): 39.62 - samples/sec: 3334.16 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:51:37,619 epoch 5 - iter 657/738 - loss 0.03468427 - time (sec): 44.80 - samples/sec: 3305.72 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:51:42,797 epoch 5 - iter 730/738 - loss 0.03596301 - time (sec): 49.97 - samples/sec: 3302.27 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:51:43,259 ----------------------------------------------------------------------------------------------------
2023-10-17 20:51:43,259 EPOCH 5 done: loss 0.0361 - lr: 0.000028
2023-10-17 20:51:54,959 DEV : loss 0.18621283769607544 - f1-score (micro avg) 0.8343
2023-10-17 20:51:54,989 saving best model
2023-10-17 20:51:55,450 ----------------------------------------------------------------------------------------------------
2023-10-17 20:52:00,421 epoch 6 - iter 73/738 - loss 0.03314020 - time (sec): 4.97 - samples/sec: 3267.43 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:52:05,581 epoch 6 - iter 146/738 - loss 0.03144585 - time (sec): 10.13 - samples/sec: 3177.13 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:52:10,154 epoch 6 - iter 219/738 - loss 0.02846193 - time (sec): 14.70 - samples/sec: 3204.07 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:52:15,181 epoch 6 - iter 292/738 - loss 0.02617097 - time (sec): 19.73 - samples/sec: 3233.54 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:52:20,401 epoch 6 - iter 365/738 - loss 0.02459012 - time (sec): 24.95 - samples/sec: 3217.27 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:52:24,933 epoch 6 - iter 438/738 - loss 0.02420569 - time (sec): 29.48 - samples/sec: 3251.94 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:52:29,485 epoch 6 - iter 511/738 - loss 0.02455905 - time (sec): 34.03 - samples/sec: 3278.27 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:52:34,344 epoch 6 - iter 584/738 - loss 0.02479615 - time (sec): 38.89 - samples/sec: 3268.69 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:52:40,085 epoch 6 - iter 657/738 - loss 0.02602451 - time (sec): 44.63 - samples/sec: 3301.26 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:52:45,132 epoch 6 - iter 730/738 - loss 0.02514643 - time (sec): 49.68 - samples/sec: 3302.41 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:52:45,851 ----------------------------------------------------------------------------------------------------
2023-10-17 20:52:45,852 EPOCH 6 done: loss 0.0252 - lr: 0.000022
2023-10-17 20:52:57,524 DEV : loss 0.19898028671741486 - f1-score (micro avg) 0.8438
2023-10-17 20:52:57,557 saving best model
2023-10-17 20:52:58,043 ----------------------------------------------------------------------------------------------------
2023-10-17 20:53:02,903 epoch 7 - iter 73/738 - loss 0.01475665 - time (sec): 4.86 - samples/sec: 3133.95 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:53:07,464 epoch 7 - iter 146/738 - loss 0.01079426 - time (sec): 9.42 - samples/sec: 3313.04 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:53:12,052 epoch 7 - iter 219/738 - loss 0.01197725 - time (sec): 14.01 - samples/sec: 3264.54 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:53:16,929 epoch 7 - iter 292/738 - loss 0.01292926 - time (sec): 18.88 - samples/sec: 3284.79 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:53:21,733 epoch 7 - iter 365/738 - loss 0.01589341 - time (sec): 23.69 - samples/sec: 3296.02 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:53:26,782 epoch 7 - iter 438/738 - loss 0.01627966 - time (sec): 28.74 - samples/sec: 3340.14 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:53:32,829 epoch 7 - iter 511/738 - loss 0.01882907 - time (sec): 34.79 - samples/sec: 3352.44 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:53:37,503 epoch 7 - iter 584/738 - loss 0.01875664 - time (sec): 39.46 - samples/sec: 3353.64 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:53:42,403 epoch 7 - iter 657/738 - loss 0.01898628 - time (sec): 44.36 - samples/sec: 3350.33 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:53:47,332 epoch 7 - iter 730/738 - loss 0.01903040 - time (sec): 49.29 - samples/sec: 3341.12 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:53:47,894 ----------------------------------------------------------------------------------------------------
2023-10-17 20:53:47,894 EPOCH 7 done: loss 0.0194 - lr: 0.000017
2023-10-17 20:53:59,398 DEV : loss 0.2004670649766922 - f1-score (micro avg) 0.8497
2023-10-17 20:53:59,432 saving best model
2023-10-17 20:53:59,916 ----------------------------------------------------------------------------------------------------
2023-10-17 20:54:04,698 epoch 8 - iter 73/738 - loss 0.01638856 - time (sec): 4.78 - samples/sec: 3252.98 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:54:09,704 epoch 8 - iter 146/738 - loss 0.01268644 - time (sec): 9.79 - samples/sec: 3215.86 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:54:14,523 epoch 8 - iter 219/738 - loss 0.01154489 - time (sec): 14.61 - samples/sec: 3238.88 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:54:19,163 epoch 8 - iter 292/738 - loss 0.01121476 - time (sec): 19.25 - samples/sec: 3253.95 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:54:25,326 epoch 8 - iter 365/738 - loss 0.01237795 - time (sec): 25.41 - samples/sec: 3256.44 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:54:31,141 epoch 8 - iter 438/738 - loss 0.01216074 - time (sec): 31.22 - samples/sec: 3254.28 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:54:36,019 epoch 8 - iter 511/738 - loss 0.01129103 - time (sec): 36.10 - samples/sec: 3253.41 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:54:41,054 epoch 8 - iter 584/738 - loss 0.01142640 - time (sec): 41.14 - samples/sec: 3261.75 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:54:45,906 epoch 8 - iter 657/738 - loss 0.01201789 - time (sec): 45.99 - samples/sec: 3247.76 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:54:50,232 epoch 8 - iter 730/738 - loss 0.01177942 - time (sec): 50.31 - samples/sec: 3270.00 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:54:50,766 ----------------------------------------------------------------------------------------------------
2023-10-17 20:54:50,767 EPOCH 8 done: loss 0.0118 - lr: 0.000011
2023-10-17 20:55:02,150 DEV : loss 0.198430597782135 - f1-score (micro avg) 0.8567
2023-10-17 20:55:02,181 saving best model
2023-10-17 20:55:02,675 ----------------------------------------------------------------------------------------------------
2023-10-17 20:55:07,868 epoch 9 - iter 73/738 - loss 0.01519568 - time (sec): 5.19 - samples/sec: 3455.39 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:55:12,960 epoch 9 - iter 146/738 - loss 0.00937816 - time (sec): 10.28 - samples/sec: 3348.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:55:17,958 epoch 9 - iter 219/738 - loss 0.00848427 - time (sec): 15.28 - samples/sec: 3265.03 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:55:23,209 epoch 9 - iter 292/738 - loss 0.00799314 - time (sec): 20.53 - samples/sec: 3258.21 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:55:27,918 epoch 9 - iter 365/738 - loss 0.00807520 - time (sec): 25.24 - samples/sec: 3272.15 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:55:33,094 epoch 9 - iter 438/738 - loss 0.00784412 - time (sec): 30.42 - samples/sec: 3286.92 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:55:38,313 epoch 9 - iter 511/738 - loss 0.00889544 - time (sec): 35.64 - samples/sec: 3274.78 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:55:43,202 epoch 9 - iter 584/738 - loss 0.00926012 - time (sec): 40.52 - samples/sec: 3281.89 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:55:48,350 epoch 9 - iter 657/738 - loss 0.00863704 - time (sec): 45.67 - samples/sec: 3283.07 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:55:52,805 epoch 9 - iter 730/738 - loss 0.00798481 - time (sec): 50.13 - samples/sec: 3290.70 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:55:53,313 ----------------------------------------------------------------------------------------------------
2023-10-17 20:55:53,313 EPOCH 9 done: loss 0.0079 - lr: 0.000006
2023-10-17 20:56:04,772 DEV : loss 0.2114827036857605 - f1-score (micro avg) 0.8592
2023-10-17 20:56:04,806 saving best model
2023-10-17 20:56:05,225 ----------------------------------------------------------------------------------------------------
2023-10-17 20:56:10,423 epoch 10 - iter 73/738 - loss 0.00697079 - time (sec): 5.20 - samples/sec: 3137.08 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:56:16,026 epoch 10 - iter 146/738 - loss 0.00806024 - time (sec): 10.80 - samples/sec: 3238.10 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:56:20,994 epoch 10 - iter 219/738 - loss 0.00854713 - time (sec): 15.77 - samples/sec: 3195.31 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:56:26,151 epoch 10 - iter 292/738 - loss 0.00725008 - time (sec): 20.92 - samples/sec: 3224.88 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:56:30,811 epoch 10 - iter 365/738 - loss 0.00633495 - time (sec): 25.58 - samples/sec: 3248.59 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:56:35,376 epoch 10 - iter 438/738 - loss 0.00772199 - time (sec): 30.15 - samples/sec: 3274.22 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:56:40,499 epoch 10 - iter 511/738 - loss 0.00693419 - time (sec): 35.27 - samples/sec: 3250.26 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:56:45,335 epoch 10 - iter 584/738 - loss 0.00661804 - time (sec): 40.11 - samples/sec: 3269.64 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:56:50,238 epoch 10 - iter 657/738 - loss 0.00621935 - time (sec): 45.01 - samples/sec: 3277.65 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:56:55,354 epoch 10 - iter 730/738 - loss 0.00579273 - time (sec): 50.13 - samples/sec: 3289.07 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:56:55,849 ----------------------------------------------------------------------------------------------------
2023-10-17 20:56:55,849 EPOCH 10 done: loss 0.0057 - lr: 0.000000
2023-10-17 20:57:07,458 DEV : loss 0.2092556357383728 - f1-score (micro avg) 0.8575
2023-10-17 20:57:07,845 ----------------------------------------------------------------------------------------------------
2023-10-17 20:57:07,846 Loading model from best epoch ...
2023-10-17 20:57:09,560 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
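The 21 tags listed above follow a BIOES-style scheme: one `O` tag plus the four prefixes `S`, `B`, `E`, `I` for each of the five entity types, which matches `out_features=21` on the model's final linear layer. The sketch below rebuilds the tag list in the logged order. The function name `build_bioes_tags` is hypothetical, and this is not how Flair constructs its dictionary internally.

```python
ENTITY_TYPES = ["loc", "pers", "org", "time", "prod"]
PREFIXES = ["S", "B", "E", "I"]  # single, begin, end, inside


def build_bioes_tags(entity_types: list) -> list:
    """Return 'O' followed by every prefix-type combination, grouped by type."""
    tags = ["O"]
    for entity_type in entity_types:
        tags.extend(f"{prefix}-{entity_type}" for prefix in PREFIXES)
    return tags


tags = build_bioes_tags(ENTITY_TYPES)
print(len(tags))
print(", ".join(tags))
```

The count is 1 + 4 × 5 = 21, so adding or removing an entity type changes the output layer size by four.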
2023-10-17 20:57:15,748
Results:
- F-score (micro) 0.8061
- F-score (macro) 0.7226
- Accuracy 0.6955

By class:
              precision    recall  f1-score   support

         loc     0.8573    0.8753    0.8662       858
        pers     0.7504    0.8175    0.7825       537
         org     0.6579    0.5682    0.6098       132
        time     0.5806    0.6667    0.6207        54
        prod     0.8333    0.6557    0.7339        61

   micro avg     0.7958    0.8167    0.8061      1642
   macro avg     0.7359    0.7167    0.7226      1642
weighted avg     0.7963    0.8167    0.8052      1642

2023-10-17 20:57:15,748 ----------------------------------------------------------------------------------------------------
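The averaged rows of the report above can be rechecked from the per-class rows. The sketch below assumes the usual definitions: macro average is the unweighted mean over classes, weighted average is the support-weighted mean, and micro F1 is the harmonic mean of micro precision and micro recall. It works from the rounded four-decimal values printed in the log, so it can only confirm agreement to that precision.

```python
# (precision, recall, f1, support) per class, copied from the report.
PER_CLASS = {
    "loc": (0.8573, 0.8753, 0.8662, 858),
    "pers": (0.7504, 0.8175, 0.7825, 537),
    "org": (0.6579, 0.5682, 0.6098, 132),
    "time": (0.5806, 0.6667, 0.6207, 54),
    "prod": (0.8333, 0.6557, 0.7339, 61),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


rows = list(PER_CLASS.values())
total_support = sum(row[3] for row in rows)

# Macro average: every class counts equally.
macro_precision = sum(row[0] for row in rows) / len(rows)
macro_recall = sum(row[1] for row in rows) / len(rows)
macro_f1 = sum(row[2] for row in rows) / len(rows)

# Weighted average: classes count in proportion to their support.
weighted_f1 = sum(row[2] * row[3] for row in rows) / total_support

# Micro F1 from the reported micro precision and recall.
micro_f1 = f1_score(0.7958, 0.8167)

print(total_support)
print(round(macro_precision, 4), round(macro_recall, 4), round(macro_f1, 4))
print(round(weighted_f1, 4), round(micro_f1, 4))
```

The gap between micro F1 (0.8061) and macro F1 (0.7226) comes from the small `org` and `time` classes, which score lowest but carry little weight in the micro average.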