Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- test.tsv +0 -0
- training.log +237 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:454345c68d6db2b85ec82024b6f8ad4483653dd63fc197efc6d589f47e32e707
+size 443311111
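best-model.pt is stored through Git LFS, so only the pointer above appears in the diff; the actual checkpoint is a ~443 MB object resolved at download time. A minimal sketch for fetching and loading it with `huggingface_hub` and Flair, assuming a placeholder repository id (the real repo id is not part of this commit):

```python
from huggingface_hub import hf_hub_download
from flair.models import SequenceTagger

# Placeholder repo id -- replace with the repository this commit belongs to.
model_path = hf_hub_download(repo_id="<user>/<repo>", filename="best-model.pt")

# Load the Flair sequence tagger stored in the checkpoint.
tagger = SequenceTagger.load(model_path)
```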
dev.tsv
ADDED
The diff for this file is too large to render.
See raw diff
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      23:15:50   0.0000         0.2558      0.1092    0.5422         0.6465      0.5898  0.4239
+2      23:17:47   0.0000         0.0843      0.1188    0.5553         0.7815      0.6492  0.4886
+3      23:19:49   0.0000         0.0595      0.1852    0.5122         0.7437      0.6066  0.4443
+4      23:21:48   0.0000         0.0433      0.2180    0.5433         0.7254      0.6213  0.4591
+5      23:23:44   0.0000         0.0324      0.2920    0.5420         0.7757      0.6381  0.4785
+6      23:25:41   0.0000         0.0232      0.3259    0.5529         0.7300      0.6292  0.4660
+7      23:27:38   0.0000         0.0151      0.3449    0.5478         0.7677      0.6394  0.4789
+8      23:29:38   0.0000         0.0105      0.3785    0.5447         0.7872      0.6439  0.4831
+9      23:31:36   0.0000         0.0070      0.3912    0.5568         0.7632      0.6438  0.4830
+10     23:33:34   0.0000         0.0042      0.4097    0.5543         0.7654      0.6430  0.4823
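loss.tsv holds one row per epoch of the run logged below. A minimal sketch for inspecting it, assuming `pandas` and `matplotlib` are installed and the file sits in a local checkout of this repository:

```python
import pandas as pd
import matplotlib.pyplot as plt

# loss.tsv is tab-separated with the header shown above.
df = pd.read_csv("loss.tsv", sep="\t")

# Train loss keeps falling while dev loss rises after epoch 2, which is why
# epoch 2 (DEV_F1 = 0.6492) ends up being kept as the best model.
ax = df.plot(x="EPOCH", y=["TRAIN_LOSS", "DEV_LOSS"], marker="o")
ax.set_xlabel("epoch")
ax.set_ylabel("loss")
plt.savefig("loss_curves.png")
```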
test.tsv
ADDED
The diff for this file is too large to render.
See raw diff
training.log
ADDED
@@ -0,0 +1,237 @@
+2023-10-14 23:13:54,062 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:13:54,063 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=768, out_features=768, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
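The dump above is a plain Flair `SequenceTagger`: BERT word embeddings (12 layers, hidden size 768), locked dropout, and a linear layer projecting to 13 tags trained with cross-entropy (no CRF). A sketch of how such a model is typically assembled in Flair; the backbone name and the layer/pooling settings are inferred from the training base path logged further down and should be treated as assumptions:

```python
from flair.data import Dictionary
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# The 13-tag BIOES dictionary exactly as listed at the end of training.log.
label_dict = Dictionary(add_unk=False)
for tag in ["O", "S-loc", "B-loc", "E-loc", "I-loc", "S-pers", "B-pers", "E-pers",
            "I-pers", "S-org", "B-org", "E-org", "I-org"]:
    label_dict.add_item(tag)

# Backbone inferred from the base path logged below; treat as an assumption.
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-cased",
    layers="-1",               # "layers-1" in the base path: last transformer layer only
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,           # unused when use_rnn=False, kept for the constructor
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,             # "crfFalse" in the base path; matches the Linear + CrossEntropyLoss head above
    use_rnn=False,
)
```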
+2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:13:54,063 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+ - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:13:54,063 Train: 14465 sentences
+2023-10-14 23:13:54,063 (train_with_dev=False, train_with_test=False)
+2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:13:54,063 Training Params:
+2023-10-14 23:13:54,063 - learning_rate: "5e-05"
+2023-10-14 23:13:54,063 - mini_batch_size: "8"
+2023-10-14 23:13:54,063 - max_epochs: "10"
+2023-10-14 23:13:54,063 - shuffle: "True"
+2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:13:54,063 Plugins:
+2023-10-14 23:13:54,063 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:13:54,063 Final evaluation on model from best epoch (best-model.pt)
+2023-10-14 23:13:54,063 - metric: "('micro avg', 'f1-score')"
+2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:13:54,063 Computation:
+2023-10-14 23:13:54,063 - compute on device: cuda:0
+2023-10-14 23:13:54,064 - embedding storage: none
+2023-10-14 23:13:54,064 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:13:54,064 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
+2023-10-14 23:13:54,064 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:13:54,064 ----------------------------------------------------------------------------------------------------
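The parameters above (learning rate 5e-05, mini-batch size 8, 10 epochs, linear schedule with 10% warmup) match Flair's standard fine-tuning entry point. A sketch of an equivalent training call, continuing the previous sketch; the corpus constructor arguments are an assumption based on the dataset path in the log:

```python
from flair.datasets import NER_HIPE_2022
from flair.trainers import ModelTrainer

# Assumed from the path .../ner_hipe_2022/v2.1/letemps/fr logged above.
corpus = NER_HIPE_2022(dataset_name="letemps", language="fr")

# `tagger` as assembled in the previous sketch.
trainer = ModelTrainer(tagger, corpus)

# Base path exactly as quoted in the log above.
base_path = ("hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased"
             "-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3")

# fine_tune() uses AdamW with a linear warmup/decay schedule by default,
# matching the "LinearScheduler | warmup_fraction: '0.1'" plugin logged above.
trainer.fine_tune(
    base_path,
    learning_rate=5e-05,   # Training Params: learning_rate "5e-05"
    mini_batch_size=8,     # Training Params: mini_batch_size "8"
    max_epochs=10,         # Training Params: max_epochs "10"
)
```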
+2023-10-14 23:14:05,117 epoch 1 - iter 180/1809 - loss 1.38557352 - time (sec): 11.05 - samples/sec: 3462.40 - lr: 0.000005 - momentum: 0.000000
+2023-10-14 23:14:16,159 epoch 1 - iter 360/1809 - loss 0.80804825 - time (sec): 22.09 - samples/sec: 3436.57 - lr: 0.000010 - momentum: 0.000000
+2023-10-14 23:14:27,057 epoch 1 - iter 540/1809 - loss 0.59504882 - time (sec): 32.99 - samples/sec: 3419.21 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 23:14:38,311 epoch 1 - iter 720/1809 - loss 0.47345095 - time (sec): 44.25 - samples/sec: 3442.23 - lr: 0.000020 - momentum: 0.000000
+2023-10-14 23:14:49,306 epoch 1 - iter 900/1809 - loss 0.40291276 - time (sec): 55.24 - samples/sec: 3439.24 - lr: 0.000025 - momentum: 0.000000
+2023-10-14 23:15:00,620 epoch 1 - iter 1080/1809 - loss 0.35417304 - time (sec): 66.56 - samples/sec: 3444.05 - lr: 0.000030 - momentum: 0.000000
+2023-10-14 23:15:11,503 epoch 1 - iter 1260/1809 - loss 0.32111486 - time (sec): 77.44 - samples/sec: 3438.10 - lr: 0.000035 - momentum: 0.000000
+2023-10-14 23:15:22,452 epoch 1 - iter 1440/1809 - loss 0.29407849 - time (sec): 88.39 - samples/sec: 3437.09 - lr: 0.000040 - momentum: 0.000000
+2023-10-14 23:15:33,541 epoch 1 - iter 1620/1809 - loss 0.27249387 - time (sec): 99.48 - samples/sec: 3427.40 - lr: 0.000045 - momentum: 0.000000
+2023-10-14 23:15:44,480 epoch 1 - iter 1800/1809 - loss 0.25650520 - time (sec): 110.42 - samples/sec: 3426.34 - lr: 0.000050 - momentum: 0.000000
+2023-10-14 23:15:44,991 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:15:44,991 EPOCH 1 done: loss 0.2558 - lr: 0.000050
+2023-10-14 23:15:50,308 DEV : loss 0.10922493785619736 - f1-score (micro avg) 0.5898
+2023-10-14 23:15:50,338 saving best model
+2023-10-14 23:15:50,724 ----------------------------------------------------------------------------------------------------
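The `lr` column traces the LinearScheduler plugin: with warmup_fraction 0.1 over 10 × 1,809 batches, the learning rate ramps from 0 to 5e-05 during epoch 1 and then decays linearly to 0, which matches the per-iteration values above and in the later epochs. A small sketch of that schedule, assuming it is a plain linear warmup followed by linear decay:

```python
def linear_warmup_lr(step: int,
                     total_steps: int = 10 * 1809,   # 10 epochs x 1,809 iterations
                     warmup_fraction: float = 0.1,
                     peak_lr: float = 5e-05) -> float:
    """Linear warmup to peak_lr, then linear decay to 0 (assumed schedule)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 1,809 steps = exactly epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Reproduces the lr column above: ~0.000005 at iter 180 of epoch 1 and
# ~0.000050 by the end of epoch 1, then a slow decay towards 0.
print(round(linear_warmup_lr(180), 6), round(linear_warmup_lr(1800), 6))
```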
+2023-10-14 23:16:01,639 epoch 2 - iter 180/1809 - loss 0.09115416 - time (sec): 10.91 - samples/sec: 3394.64 - lr: 0.000049 - momentum: 0.000000
+2023-10-14 23:16:12,680 epoch 2 - iter 360/1809 - loss 0.09214244 - time (sec): 21.95 - samples/sec: 3428.42 - lr: 0.000049 - momentum: 0.000000
+2023-10-14 23:16:23,647 epoch 2 - iter 540/1809 - loss 0.08998102 - time (sec): 32.92 - samples/sec: 3449.64 - lr: 0.000048 - momentum: 0.000000
+2023-10-14 23:16:34,929 epoch 2 - iter 720/1809 - loss 0.08845021 - time (sec): 44.20 - samples/sec: 3448.58 - lr: 0.000048 - momentum: 0.000000
+2023-10-14 23:16:45,932 epoch 2 - iter 900/1809 - loss 0.08692869 - time (sec): 55.21 - samples/sec: 3466.90 - lr: 0.000047 - momentum: 0.000000
+2023-10-14 23:16:57,081 epoch 2 - iter 1080/1809 - loss 0.08601996 - time (sec): 66.36 - samples/sec: 3452.03 - lr: 0.000047 - momentum: 0.000000
+2023-10-14 23:17:08,054 epoch 2 - iter 1260/1809 - loss 0.08669678 - time (sec): 77.33 - samples/sec: 3443.25 - lr: 0.000046 - momentum: 0.000000
+2023-10-14 23:17:18,723 epoch 2 - iter 1440/1809 - loss 0.08604451 - time (sec): 88.00 - samples/sec: 3433.05 - lr: 0.000046 - momentum: 0.000000
+2023-10-14 23:17:29,971 epoch 2 - iter 1620/1809 - loss 0.08524993 - time (sec): 99.25 - samples/sec: 3434.57 - lr: 0.000045 - momentum: 0.000000
+2023-10-14 23:17:40,783 epoch 2 - iter 1800/1809 - loss 0.08442574 - time (sec): 110.06 - samples/sec: 3435.34 - lr: 0.000044 - momentum: 0.000000
+2023-10-14 23:17:41,312 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:17:41,313 EPOCH 2 done: loss 0.0843 - lr: 0.000044
+2023-10-14 23:17:47,616 DEV : loss 0.11877861618995667 - f1-score (micro avg) 0.6492
+2023-10-14 23:17:47,664 saving best model
+2023-10-14 23:17:48,158 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:17:59,754 epoch 3 - iter 180/1809 - loss 0.05224098 - time (sec): 11.59 - samples/sec: 3334.61 - lr: 0.000044 - momentum: 0.000000
+2023-10-14 23:18:10,717 epoch 3 - iter 360/1809 - loss 0.05757660 - time (sec): 22.56 - samples/sec: 3391.79 - lr: 0.000043 - momentum: 0.000000
+2023-10-14 23:18:21,815 epoch 3 - iter 540/1809 - loss 0.05923171 - time (sec): 33.65 - samples/sec: 3374.86 - lr: 0.000043 - momentum: 0.000000
+2023-10-14 23:18:32,890 epoch 3 - iter 720/1809 - loss 0.05951510 - time (sec): 44.73 - samples/sec: 3404.49 - lr: 0.000042 - momentum: 0.000000
+2023-10-14 23:18:44,341 epoch 3 - iter 900/1809 - loss 0.05917685 - time (sec): 56.18 - samples/sec: 3376.67 - lr: 0.000042 - momentum: 0.000000
+2023-10-14 23:18:55,811 epoch 3 - iter 1080/1809 - loss 0.05994787 - time (sec): 67.65 - samples/sec: 3358.55 - lr: 0.000041 - momentum: 0.000000
+2023-10-14 23:19:07,534 epoch 3 - iter 1260/1809 - loss 0.05928422 - time (sec): 79.37 - samples/sec: 3339.69 - lr: 0.000041 - momentum: 0.000000
+2023-10-14 23:19:19,370 epoch 3 - iter 1440/1809 - loss 0.05933673 - time (sec): 91.21 - samples/sec: 3312.36 - lr: 0.000040 - momentum: 0.000000
+2023-10-14 23:19:30,948 epoch 3 - iter 1620/1809 - loss 0.06123094 - time (sec): 102.79 - samples/sec: 3312.93 - lr: 0.000039 - momentum: 0.000000
+2023-10-14 23:19:41,857 epoch 3 - iter 1800/1809 - loss 0.05963446 - time (sec): 113.70 - samples/sec: 3325.96 - lr: 0.000039 - momentum: 0.000000
+2023-10-14 23:19:42,476 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:19:42,476 EPOCH 3 done: loss 0.0595 - lr: 0.000039
+2023-10-14 23:19:49,864 DEV : loss 0.1852269172668457 - f1-score (micro avg) 0.6066
+2023-10-14 23:19:49,907 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:20:01,922 epoch 4 - iter 180/1809 - loss 0.03466777 - time (sec): 12.01 - samples/sec: 3240.89 - lr: 0.000038 - momentum: 0.000000
+2023-10-14 23:20:12,957 epoch 4 - iter 360/1809 - loss 0.03740752 - time (sec): 23.05 - samples/sec: 3315.19 - lr: 0.000038 - momentum: 0.000000
+2023-10-14 23:20:24,052 epoch 4 - iter 540/1809 - loss 0.04243438 - time (sec): 34.14 - samples/sec: 3350.72 - lr: 0.000037 - momentum: 0.000000
+2023-10-14 23:20:35,171 epoch 4 - iter 720/1809 - loss 0.04361553 - time (sec): 45.26 - samples/sec: 3339.38 - lr: 0.000037 - momentum: 0.000000
+2023-10-14 23:20:46,045 epoch 4 - iter 900/1809 - loss 0.04259376 - time (sec): 56.14 - samples/sec: 3357.05 - lr: 0.000036 - momentum: 0.000000
+2023-10-14 23:20:57,326 epoch 4 - iter 1080/1809 - loss 0.04216192 - time (sec): 67.42 - samples/sec: 3361.99 - lr: 0.000036 - momentum: 0.000000
+2023-10-14 23:21:08,492 epoch 4 - iter 1260/1809 - loss 0.04211285 - time (sec): 78.58 - samples/sec: 3368.00 - lr: 0.000035 - momentum: 0.000000
+2023-10-14 23:21:19,539 epoch 4 - iter 1440/1809 - loss 0.04181257 - time (sec): 89.63 - samples/sec: 3382.30 - lr: 0.000034 - momentum: 0.000000
+2023-10-14 23:21:30,326 epoch 4 - iter 1620/1809 - loss 0.04312238 - time (sec): 100.42 - samples/sec: 3393.06 - lr: 0.000034 - momentum: 0.000000
+2023-10-14 23:21:41,155 epoch 4 - iter 1800/1809 - loss 0.04326423 - time (sec): 111.25 - samples/sec: 3398.47 - lr: 0.000033 - momentum: 0.000000
+2023-10-14 23:21:41,677 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:21:41,677 EPOCH 4 done: loss 0.0433 - lr: 0.000033
+2023-10-14 23:21:48,355 DEV : loss 0.21803739666938782 - f1-score (micro avg) 0.6213
+2023-10-14 23:21:48,388 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:21:59,544 epoch 5 - iter 180/1809 - loss 0.03382032 - time (sec): 11.16 - samples/sec: 3385.98 - lr: 0.000033 - momentum: 0.000000
+2023-10-14 23:22:10,312 epoch 5 - iter 360/1809 - loss 0.02988533 - time (sec): 21.92 - samples/sec: 3446.91 - lr: 0.000032 - momentum: 0.000000
+2023-10-14 23:22:21,368 epoch 5 - iter 540/1809 - loss 0.02897812 - time (sec): 32.98 - samples/sec: 3436.13 - lr: 0.000032 - momentum: 0.000000
+2023-10-14 23:22:32,201 epoch 5 - iter 720/1809 - loss 0.02920811 - time (sec): 43.81 - samples/sec: 3422.99 - lr: 0.000031 - momentum: 0.000000
+2023-10-14 23:22:43,261 epoch 5 - iter 900/1809 - loss 0.03020770 - time (sec): 54.87 - samples/sec: 3422.06 - lr: 0.000031 - momentum: 0.000000
+2023-10-14 23:22:54,098 epoch 5 - iter 1080/1809 - loss 0.03043284 - time (sec): 65.71 - samples/sec: 3422.19 - lr: 0.000030 - momentum: 0.000000
+2023-10-14 23:23:04,857 epoch 5 - iter 1260/1809 - loss 0.03101516 - time (sec): 76.47 - samples/sec: 3423.04 - lr: 0.000029 - momentum: 0.000000
+2023-10-14 23:23:16,285 epoch 5 - iter 1440/1809 - loss 0.03128116 - time (sec): 87.90 - samples/sec: 3425.18 - lr: 0.000029 - momentum: 0.000000
+2023-10-14 23:23:27,369 epoch 5 - iter 1620/1809 - loss 0.03158724 - time (sec): 98.98 - samples/sec: 3432.98 - lr: 0.000028 - momentum: 0.000000
+2023-10-14 23:23:38,672 epoch 5 - iter 1800/1809 - loss 0.03222026 - time (sec): 110.28 - samples/sec: 3430.35 - lr: 0.000028 - momentum: 0.000000
+2023-10-14 23:23:39,199 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:23:39,199 EPOCH 5 done: loss 0.0324 - lr: 0.000028
+2023-10-14 23:23:44,781 DEV : loss 0.29204583168029785 - f1-score (micro avg) 0.6381
+2023-10-14 23:23:44,816 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:23:55,602 epoch 6 - iter 180/1809 - loss 0.01835359 - time (sec): 10.78 - samples/sec: 3331.85 - lr: 0.000027 - momentum: 0.000000
+2023-10-14 23:24:06,716 epoch 6 - iter 360/1809 - loss 0.02104928 - time (sec): 21.90 - samples/sec: 3383.12 - lr: 0.000027 - momentum: 0.000000
+2023-10-14 23:24:17,593 epoch 6 - iter 540/1809 - loss 0.02382736 - time (sec): 32.78 - samples/sec: 3387.15 - lr: 0.000026 - momentum: 0.000000
+2023-10-14 23:24:29,242 epoch 6 - iter 720/1809 - loss 0.02520287 - time (sec): 44.43 - samples/sec: 3348.00 - lr: 0.000026 - momentum: 0.000000
+2023-10-14 23:24:40,477 epoch 6 - iter 900/1809 - loss 0.02464033 - time (sec): 55.66 - samples/sec: 3367.84 - lr: 0.000025 - momentum: 0.000000
+2023-10-14 23:24:51,833 epoch 6 - iter 1080/1809 - loss 0.02490246 - time (sec): 67.02 - samples/sec: 3365.05 - lr: 0.000024 - momentum: 0.000000
+2023-10-14 23:25:02,971 epoch 6 - iter 1260/1809 - loss 0.02386734 - time (sec): 78.15 - samples/sec: 3385.64 - lr: 0.000024 - momentum: 0.000000
+2023-10-14 23:25:14,059 epoch 6 - iter 1440/1809 - loss 0.02384248 - time (sec): 89.24 - samples/sec: 3407.87 - lr: 0.000023 - momentum: 0.000000
+2023-10-14 23:25:24,684 epoch 6 - iter 1620/1809 - loss 0.02370424 - time (sec): 99.87 - samples/sec: 3405.70 - lr: 0.000023 - momentum: 0.000000
+2023-10-14 23:25:35,893 epoch 6 - iter 1800/1809 - loss 0.02323564 - time (sec): 111.08 - samples/sec: 3403.90 - lr: 0.000022 - momentum: 0.000000
+2023-10-14 23:25:36,460 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:25:36,460 EPOCH 6 done: loss 0.0232 - lr: 0.000022
+2023-10-14 23:25:41,960 DEV : loss 0.325937956571579 - f1-score (micro avg) 0.6292
+2023-10-14 23:25:41,990 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:25:53,211 epoch 7 - iter 180/1809 - loss 0.01268428 - time (sec): 11.22 - samples/sec: 3309.54 - lr: 0.000022 - momentum: 0.000000
+2023-10-14 23:26:04,242 epoch 7 - iter 360/1809 - loss 0.01253572 - time (sec): 22.25 - samples/sec: 3434.72 - lr: 0.000021 - momentum: 0.000000
+2023-10-14 23:26:15,278 epoch 7 - iter 540/1809 - loss 0.01170739 - time (sec): 33.29 - samples/sec: 3474.17 - lr: 0.000021 - momentum: 0.000000
+2023-10-14 23:26:26,064 epoch 7 - iter 720/1809 - loss 0.01321558 - time (sec): 44.07 - samples/sec: 3453.48 - lr: 0.000020 - momentum: 0.000000
+2023-10-14 23:26:37,214 epoch 7 - iter 900/1809 - loss 0.01522502 - time (sec): 55.22 - samples/sec: 3442.44 - lr: 0.000019 - momentum: 0.000000
+2023-10-14 23:26:48,017 epoch 7 - iter 1080/1809 - loss 0.01503871 - time (sec): 66.03 - samples/sec: 3451.90 - lr: 0.000019 - momentum: 0.000000
+2023-10-14 23:26:58,958 epoch 7 - iter 1260/1809 - loss 0.01509298 - time (sec): 76.97 - samples/sec: 3464.90 - lr: 0.000018 - momentum: 0.000000
+2023-10-14 23:27:09,510 epoch 7 - iter 1440/1809 - loss 0.01494629 - time (sec): 87.52 - samples/sec: 3455.94 - lr: 0.000018 - momentum: 0.000000
+2023-10-14 23:27:21,399 epoch 7 - iter 1620/1809 - loss 0.01502665 - time (sec): 99.41 - samples/sec: 3423.29 - lr: 0.000017 - momentum: 0.000000
+2023-10-14 23:27:32,509 epoch 7 - iter 1800/1809 - loss 0.01515155 - time (sec): 110.52 - samples/sec: 3422.67 - lr: 0.000017 - momentum: 0.000000
+2023-10-14 23:27:33,040 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:27:33,041 EPOCH 7 done: loss 0.0151 - lr: 0.000017
+2023-10-14 23:27:38,689 DEV : loss 0.34486714005470276 - f1-score (micro avg) 0.6394
+2023-10-14 23:27:38,721 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:27:49,645 epoch 8 - iter 180/1809 - loss 0.00504407 - time (sec): 10.92 - samples/sec: 3368.16 - lr: 0.000016 - momentum: 0.000000
+2023-10-14 23:28:00,921 epoch 8 - iter 360/1809 - loss 0.00779998 - time (sec): 22.20 - samples/sec: 3393.34 - lr: 0.000016 - momentum: 0.000000
+2023-10-14 23:28:12,199 epoch 8 - iter 540/1809 - loss 0.00806594 - time (sec): 33.48 - samples/sec: 3389.11 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 23:28:23,292 epoch 8 - iter 720/1809 - loss 0.00804521 - time (sec): 44.57 - samples/sec: 3387.81 - lr: 0.000014 - momentum: 0.000000
+2023-10-14 23:28:34,228 epoch 8 - iter 900/1809 - loss 0.00787755 - time (sec): 55.51 - samples/sec: 3387.75 - lr: 0.000014 - momentum: 0.000000
+2023-10-14 23:28:45,339 epoch 8 - iter 1080/1809 - loss 0.00837508 - time (sec): 66.62 - samples/sec: 3398.85 - lr: 0.000013 - momentum: 0.000000
+2023-10-14 23:28:56,214 epoch 8 - iter 1260/1809 - loss 0.00920763 - time (sec): 77.49 - samples/sec: 3418.01 - lr: 0.000013 - momentum: 0.000000
+2023-10-14 23:29:07,281 epoch 8 - iter 1440/1809 - loss 0.01023911 - time (sec): 88.56 - samples/sec: 3413.01 - lr: 0.000012 - momentum: 0.000000
+2023-10-14 23:29:18,361 epoch 8 - iter 1620/1809 - loss 0.01031827 - time (sec): 99.64 - samples/sec: 3406.77 - lr: 0.000012 - momentum: 0.000000
+2023-10-14 23:29:29,759 epoch 8 - iter 1800/1809 - loss 0.01052622 - time (sec): 111.04 - samples/sec: 3408.30 - lr: 0.000011 - momentum: 0.000000
+2023-10-14 23:29:30,243 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:29:30,243 EPOCH 8 done: loss 0.0105 - lr: 0.000011
+2023-10-14 23:29:38,316 DEV : loss 0.37847810983657837 - f1-score (micro avg) 0.6439
+2023-10-14 23:29:38,350 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:29:49,652 epoch 9 - iter 180/1809 - loss 0.00514140 - time (sec): 11.30 - samples/sec: 3379.90 - lr: 0.000011 - momentum: 0.000000
+2023-10-14 23:30:00,849 epoch 9 - iter 360/1809 - loss 0.00528666 - time (sec): 22.50 - samples/sec: 3363.85 - lr: 0.000010 - momentum: 0.000000
+2023-10-14 23:30:11,917 epoch 9 - iter 540/1809 - loss 0.00706337 - time (sec): 33.57 - samples/sec: 3376.04 - lr: 0.000009 - momentum: 0.000000
+2023-10-14 23:30:23,087 epoch 9 - iter 720/1809 - loss 0.00660810 - time (sec): 44.74 - samples/sec: 3397.13 - lr: 0.000009 - momentum: 0.000000
+2023-10-14 23:30:34,171 epoch 9 - iter 900/1809 - loss 0.00744791 - time (sec): 55.82 - samples/sec: 3402.85 - lr: 0.000008 - momentum: 0.000000
+2023-10-14 23:30:45,401 epoch 9 - iter 1080/1809 - loss 0.00728486 - time (sec): 67.05 - samples/sec: 3414.58 - lr: 0.000008 - momentum: 0.000000
+2023-10-14 23:30:56,449 epoch 9 - iter 1260/1809 - loss 0.00750422 - time (sec): 78.10 - samples/sec: 3417.96 - lr: 0.000007 - momentum: 0.000000
+2023-10-14 23:31:07,262 epoch 9 - iter 1440/1809 - loss 0.00739246 - time (sec): 88.91 - samples/sec: 3414.94 - lr: 0.000007 - momentum: 0.000000
+2023-10-14 23:31:18,285 epoch 9 - iter 1620/1809 - loss 0.00714565 - time (sec): 99.93 - samples/sec: 3420.75 - lr: 0.000006 - momentum: 0.000000
+2023-10-14 23:31:28,970 epoch 9 - iter 1800/1809 - loss 0.00700843 - time (sec): 110.62 - samples/sec: 3418.39 - lr: 0.000006 - momentum: 0.000000
+2023-10-14 23:31:29,513 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:31:29,513 EPOCH 9 done: loss 0.0070 - lr: 0.000006
+2023-10-14 23:31:35,977 DEV : loss 0.39124542474746704 - f1-score (micro avg) 0.6438
+2023-10-14 23:31:36,014 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:31:47,439 epoch 10 - iter 180/1809 - loss 0.00408599 - time (sec): 11.42 - samples/sec: 3336.10 - lr: 0.000005 - momentum: 0.000000
+2023-10-14 23:31:58,371 epoch 10 - iter 360/1809 - loss 0.00331461 - time (sec): 22.36 - samples/sec: 3400.13 - lr: 0.000004 - momentum: 0.000000
+2023-10-14 23:32:09,317 epoch 10 - iter 540/1809 - loss 0.00293145 - time (sec): 33.30 - samples/sec: 3393.56 - lr: 0.000004 - momentum: 0.000000
+2023-10-14 23:32:20,482 epoch 10 - iter 720/1809 - loss 0.00380863 - time (sec): 44.47 - samples/sec: 3405.89 - lr: 0.000003 - momentum: 0.000000
+2023-10-14 23:32:31,512 epoch 10 - iter 900/1809 - loss 0.00352863 - time (sec): 55.50 - samples/sec: 3420.09 - lr: 0.000003 - momentum: 0.000000
+2023-10-14 23:32:42,367 epoch 10 - iter 1080/1809 - loss 0.00358662 - time (sec): 66.35 - samples/sec: 3425.15 - lr: 0.000002 - momentum: 0.000000
+2023-10-14 23:32:53,206 epoch 10 - iter 1260/1809 - loss 0.00382641 - time (sec): 77.19 - samples/sec: 3431.10 - lr: 0.000002 - momentum: 0.000000
+2023-10-14 23:33:04,379 epoch 10 - iter 1440/1809 - loss 0.00369667 - time (sec): 88.36 - samples/sec: 3435.24 - lr: 0.000001 - momentum: 0.000000
+2023-10-14 23:33:15,225 epoch 10 - iter 1620/1809 - loss 0.00426803 - time (sec): 99.21 - samples/sec: 3416.99 - lr: 0.000001 - momentum: 0.000000
+2023-10-14 23:33:26,612 epoch 10 - iter 1800/1809 - loss 0.00418167 - time (sec): 110.60 - samples/sec: 3421.02 - lr: 0.000000 - momentum: 0.000000
+2023-10-14 23:33:27,097 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:33:27,097 EPOCH 10 done: loss 0.0042 - lr: 0.000000
+2023-10-14 23:33:34,503 DEV : loss 0.40973159670829773 - f1-score (micro avg) 0.643
+2023-10-14 23:33:34,924 ----------------------------------------------------------------------------------------------------
+2023-10-14 23:33:34,926 Loading model from best epoch ...
+2023-10-14 23:33:36,531 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+2023-10-14 23:33:44,358
+Results:
+- F-score (micro) 0.6361
+- F-score (macro) 0.4367
+- Accuracy 0.4774
+
+By class:
+              precision    recall  f1-score   support
+
+         loc     0.6206    0.7750    0.6892       591
+        pers     0.5596    0.6975    0.6209       357
+         org     0.0000    0.0000    0.0000        79
+
+   micro avg     0.5911    0.6884    0.6361      1027
+   macro avg     0.3934    0.4908    0.4367      1027
+weighted avg     0.5516    0.6884    0.6125      1027
+
+2023-10-14 23:33:44,359 ----------------------------------------------------------------------------------------------------
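Per the log, the final test scores come from the best checkpoint saved after epoch 2 (micro F1 0.6361, with zero F1 on the `org` class). A minimal sketch of running inference with the uploaded model, assuming best-model.pt has been fetched locally; the example sentence is hypothetical:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint added in this commit (fetched via Git LFS / hf_hub_download).
tagger = SequenceTagger.load("best-model.pt")

# Hypothetical French input, not taken from the corpus.
sentence = Sentence("Le Temps est publié à Genève .")
tagger.predict(sentence)

# Prints loc / pers / org spans from the 13-tag BIOES scheme listed in the log.
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)
```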