Upload folder using huggingface_hub

- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697568660.bce904bcef33.2251.18 +3 -0
- test.tsv +0 -0
- training.log +236 -0
best-model.pt ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:2b9555b8c0cf24519cb6d747623fd3e0973bf9380a62f59caeb86cc5060c83f5
size 440941957
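best-model.pt is stored via Git LFS, so the diff shows only the three-line spec-v1 pointer, not the 440 MB checkpoint itself. A minimal sketch of how such a pointer text is derived from a local file (the helper name `lfs_pointer` is illustrative, not part of the git-lfs tooling):

```python
import hashlib
import os

def lfs_pointer(path: str) -> str:
    """Build a Git LFS pointer (spec v1) for a local file.

    The pointer records only the blob's SHA-256 digest and byte size,
    which is why the diff for a large binary is just three lines.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large checkpoints are not read into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{digest.hexdigest()}\n"
        f"size {os.path.getsize(path)}\n"
    )
```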
dev.tsv ADDED
The diff for this file is too large to render. See raw diff.
loss.tsv ADDED
EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
1      18:51:55   0.0000         0.4518      0.0857    0.7548         0.7696      0.7621  0.6224
2      18:52:52   0.0000         0.0871      0.0563    0.8789         0.8543      0.8664  0.7729
3      18:53:49   0.0000         0.0606      0.0610    0.9110         0.8564      0.8829  0.7994
4      18:54:45   0.0000         0.0432      0.0706    0.8862         0.8450      0.8652  0.7695
5      18:55:41   0.0000         0.0324      0.0791    0.9126         0.8306      0.8697  0.7746
6      18:56:37   0.0000         0.0263      0.0874    0.8989         0.8636      0.8809  0.7977
7      18:57:33   0.0000         0.0184      0.1058    0.9079         0.8554      0.8809  0.7954
8      18:58:29   0.0000         0.0139      0.1137    0.9060         0.8564      0.8805  0.7956
9      18:59:24   0.0000         0.0117      0.1161    0.9008         0.8626      0.8813  0.7960
10     19:00:21   0.0000         0.0079      0.1215    0.8986         0.8605      0.8792  0.7933
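The loss.tsv above shows dev F1 peaking at epoch 3 (0.8829), which matches the last "saving best model" entry in training.log. A small sketch of picking the best epoch out of such a file; the inlined excerpt keeps only the EPOCH and DEV_F1 columns:

```python
import csv
import io

# Excerpt of loss.tsv from this commit (EPOCH and DEV_F1 columns only).
LOSS_TSV = """EPOCH\tDEV_F1
1\t0.7621
2\t0.8664
3\t0.8829
4\t0.8652
5\t0.8697
6\t0.8809
7\t0.8809
8\t0.8805
9\t0.8813
10\t0.8792
"""

def best_epoch(tsv_text: str) -> tuple[int, float]:
    """Return (epoch, dev_f1) for the row with the highest dev F1."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    best = max(rows, key=lambda r: float(r["DEV_F1"]))
    return int(best["EPOCH"]), float(best["DEV_F1"])
```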
runs/events.out.tfevents.1697568660.bce904bcef33.2251.18 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:eb3bda6f0b3131c93d3901dcfdcd5a52291996e2ce499cf6f02c67719fcd8d50
size 407048
test.tsv ADDED
The diff for this file is too large to render. See raw diff.
training.log ADDED
2023-10-17 18:51:00,787 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Train: 5777 sentences
2023-10-17 18:51:00,788 (train_with_dev=False, train_with_test=False)
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Training Params:
2023-10-17 18:51:00,788 - learning_rate: "3e-05"
2023-10-17 18:51:00,788 - mini_batch_size: "8"
2023-10-17 18:51:00,788 - max_epochs: "10"
2023-10-17 18:51:00,788 - shuffle: "True"
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Plugins:
2023-10-17 18:51:00,788 - TensorboardLogger
2023-10-17 18:51:00,788 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:51:00,788 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Computation:
2023-10-17 18:51:00,788 - compute on device: cuda:0
2023-10-17 18:51:00,788 - embedding storage: none
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,789 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,789 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:51:05,999 epoch 1 - iter 72/723 - loss 2.73572129 - time (sec): 5.21 - samples/sec: 3228.40 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:51:11,090 epoch 1 - iter 144/723 - loss 1.71636677 - time (sec): 10.30 - samples/sec: 3297.56 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:51:16,224 epoch 1 - iter 216/723 - loss 1.21015689 - time (sec): 15.43 - samples/sec: 3319.86 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:51:21,390 epoch 1 - iter 288/723 - loss 0.94420268 - time (sec): 20.60 - samples/sec: 3347.59 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:51:26,193 epoch 1 - iter 360/723 - loss 0.78590918 - time (sec): 25.40 - samples/sec: 3395.27 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:51:31,477 epoch 1 - iter 432/723 - loss 0.67292077 - time (sec): 30.69 - samples/sec: 3400.44 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:51:36,674 epoch 1 - iter 504/723 - loss 0.59756926 - time (sec): 35.88 - samples/sec: 3400.07 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:51:42,043 epoch 1 - iter 576/723 - loss 0.53717802 - time (sec): 41.25 - samples/sec: 3387.98 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:51:47,518 epoch 1 - iter 648/723 - loss 0.48924915 - time (sec): 46.73 - samples/sec: 3371.37 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:51:52,896 epoch 1 - iter 720/723 - loss 0.45323298 - time (sec): 52.11 - samples/sec: 3367.96 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:51:53,104 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:53,105 EPOCH 1 done: loss 0.4518 - lr: 0.000030
2023-10-17 18:51:55,813 DEV : loss 0.08571955561637878 - f1-score (micro avg) 0.7621
2023-10-17 18:51:55,830 saving best model
2023-10-17 18:51:56,351 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:01,211 epoch 2 - iter 72/723 - loss 0.09693799 - time (sec): 4.86 - samples/sec: 3414.32 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:52:06,705 epoch 2 - iter 144/723 - loss 0.09826077 - time (sec): 10.35 - samples/sec: 3307.89 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:52:11,708 epoch 2 - iter 216/723 - loss 0.09503249 - time (sec): 15.36 - samples/sec: 3367.88 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:52:16,853 epoch 2 - iter 288/723 - loss 0.09891978 - time (sec): 20.50 - samples/sec: 3362.88 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:52:22,304 epoch 2 - iter 360/723 - loss 0.09274178 - time (sec): 25.95 - samples/sec: 3380.81 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:52:27,873 epoch 2 - iter 432/723 - loss 0.08927963 - time (sec): 31.52 - samples/sec: 3401.91 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:52:32,724 epoch 2 - iter 504/723 - loss 0.09084334 - time (sec): 36.37 - samples/sec: 3381.60 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:52:38,353 epoch 2 - iter 576/723 - loss 0.09010738 - time (sec): 42.00 - samples/sec: 3375.35 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:52:43,617 epoch 2 - iter 648/723 - loss 0.08974176 - time (sec): 47.26 - samples/sec: 3348.81 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:52:48,857 epoch 2 - iter 720/723 - loss 0.08723098 - time (sec): 52.50 - samples/sec: 3343.67 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:52:49,018 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:49,019 EPOCH 2 done: loss 0.0871 - lr: 0.000027
2023-10-17 18:52:52,238 DEV : loss 0.05628642439842224 - f1-score (micro avg) 0.8664
2023-10-17 18:52:52,255 saving best model
2023-10-17 18:52:52,649 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:58,275 epoch 3 - iter 72/723 - loss 0.06108376 - time (sec): 5.62 - samples/sec: 3077.41 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:53:03,076 epoch 3 - iter 144/723 - loss 0.06229454 - time (sec): 10.43 - samples/sec: 3255.95 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:53:08,621 epoch 3 - iter 216/723 - loss 0.06269040 - time (sec): 15.97 - samples/sec: 3261.71 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:53:14,059 epoch 3 - iter 288/723 - loss 0.05782452 - time (sec): 21.41 - samples/sec: 3273.63 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:53:19,376 epoch 3 - iter 360/723 - loss 0.05776341 - time (sec): 26.73 - samples/sec: 3305.07 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:53:24,787 epoch 3 - iter 432/723 - loss 0.06112370 - time (sec): 32.14 - samples/sec: 3288.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:53:29,871 epoch 3 - iter 504/723 - loss 0.06276585 - time (sec): 37.22 - samples/sec: 3302.02 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:53:35,301 epoch 3 - iter 576/723 - loss 0.06100180 - time (sec): 42.65 - samples/sec: 3318.70 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:53:40,357 epoch 3 - iter 648/723 - loss 0.06087244 - time (sec): 47.71 - samples/sec: 3323.88 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:53:45,595 epoch 3 - iter 720/723 - loss 0.06067803 - time (sec): 52.94 - samples/sec: 3316.47 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:53:45,776 ----------------------------------------------------------------------------------------------------
2023-10-17 18:53:45,777 EPOCH 3 done: loss 0.0606 - lr: 0.000023
2023-10-17 18:53:48,994 DEV : loss 0.061015695333480835 - f1-score (micro avg) 0.8829
2023-10-17 18:53:49,010 saving best model
2023-10-17 18:53:49,436 ----------------------------------------------------------------------------------------------------
2023-10-17 18:53:54,717 epoch 4 - iter 72/723 - loss 0.04387038 - time (sec): 5.28 - samples/sec: 3489.64 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:54:00,075 epoch 4 - iter 144/723 - loss 0.03977768 - time (sec): 10.64 - samples/sec: 3434.68 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:54:05,046 epoch 4 - iter 216/723 - loss 0.04174055 - time (sec): 15.61 - samples/sec: 3394.76 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:54:10,454 epoch 4 - iter 288/723 - loss 0.04295155 - time (sec): 21.02 - samples/sec: 3371.28 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:54:15,352 epoch 4 - iter 360/723 - loss 0.04184529 - time (sec): 25.92 - samples/sec: 3370.25 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:54:20,627 epoch 4 - iter 432/723 - loss 0.04216528 - time (sec): 31.19 - samples/sec: 3358.32 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:54:25,677 epoch 4 - iter 504/723 - loss 0.04174425 - time (sec): 36.24 - samples/sec: 3385.08 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:54:31,093 epoch 4 - iter 576/723 - loss 0.04256262 - time (sec): 41.66 - samples/sec: 3366.49 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:54:36,301 epoch 4 - iter 648/723 - loss 0.04255295 - time (sec): 46.86 - samples/sec: 3360.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:54:41,551 epoch 4 - iter 720/723 - loss 0.04326055 - time (sec): 52.11 - samples/sec: 3372.14 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:54:41,713 ----------------------------------------------------------------------------------------------------
2023-10-17 18:54:41,713 EPOCH 4 done: loss 0.0432 - lr: 0.000020
2023-10-17 18:54:45,293 DEV : loss 0.07059507817029953 - f1-score (micro avg) 0.8652
2023-10-17 18:54:45,309 ----------------------------------------------------------------------------------------------------
2023-10-17 18:54:50,582 epoch 5 - iter 72/723 - loss 0.03765946 - time (sec): 5.27 - samples/sec: 3200.52 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:54:55,435 epoch 5 - iter 144/723 - loss 0.03376893 - time (sec): 10.12 - samples/sec: 3269.32 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:55:01,381 epoch 5 - iter 216/723 - loss 0.03444213 - time (sec): 16.07 - samples/sec: 3259.98 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:55:06,439 epoch 5 - iter 288/723 - loss 0.03201450 - time (sec): 21.13 - samples/sec: 3278.94 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:55:11,837 epoch 5 - iter 360/723 - loss 0.03021248 - time (sec): 26.53 - samples/sec: 3268.82 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:55:17,061 epoch 5 - iter 432/723 - loss 0.03095066 - time (sec): 31.75 - samples/sec: 3295.73 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:55:22,311 epoch 5 - iter 504/723 - loss 0.03205508 - time (sec): 37.00 - samples/sec: 3318.65 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:55:27,501 epoch 5 - iter 576/723 - loss 0.03251336 - time (sec): 42.19 - samples/sec: 3325.42 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:55:32,561 epoch 5 - iter 648/723 - loss 0.03276588 - time (sec): 47.25 - samples/sec: 3329.33 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:55:37,985 epoch 5 - iter 720/723 - loss 0.03238438 - time (sec): 52.67 - samples/sec: 3338.39 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:55:38,138 ----------------------------------------------------------------------------------------------------
2023-10-17 18:55:38,139 EPOCH 5 done: loss 0.0324 - lr: 0.000017
2023-10-17 18:55:41,451 DEV : loss 0.07911184430122375 - f1-score (micro avg) 0.8697
2023-10-17 18:55:41,469 ----------------------------------------------------------------------------------------------------
2023-10-17 18:55:46,855 epoch 6 - iter 72/723 - loss 0.01939646 - time (sec): 5.38 - samples/sec: 3381.89 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:55:52,097 epoch 6 - iter 144/723 - loss 0.02202295 - time (sec): 10.63 - samples/sec: 3376.79 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:55:57,312 epoch 6 - iter 216/723 - loss 0.02329081 - time (sec): 15.84 - samples/sec: 3385.10 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:56:03,153 epoch 6 - iter 288/723 - loss 0.02709437 - time (sec): 21.68 - samples/sec: 3284.61 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:56:08,569 epoch 6 - iter 360/723 - loss 0.02852147 - time (sec): 27.10 - samples/sec: 3309.49 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:56:13,824 epoch 6 - iter 432/723 - loss 0.02704669 - time (sec): 32.35 - samples/sec: 3322.19 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:56:18,846 epoch 6 - iter 504/723 - loss 0.02698341 - time (sec): 37.38 - samples/sec: 3337.29 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:56:23,707 epoch 6 - iter 576/723 - loss 0.02713054 - time (sec): 42.24 - samples/sec: 3344.91 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:56:28,856 epoch 6 - iter 648/723 - loss 0.02638017 - time (sec): 47.38 - samples/sec: 3348.51 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:56:33,896 epoch 6 - iter 720/723 - loss 0.02633353 - time (sec): 52.43 - samples/sec: 3352.29 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:56:34,069 ----------------------------------------------------------------------------------------------------
2023-10-17 18:56:34,069 EPOCH 6 done: loss 0.0263 - lr: 0.000013
2023-10-17 18:56:37,242 DEV : loss 0.08742444217205048 - f1-score (micro avg) 0.8809
2023-10-17 18:56:37,259 ----------------------------------------------------------------------------------------------------
2023-10-17 18:56:42,537 epoch 7 - iter 72/723 - loss 0.01010414 - time (sec): 5.28 - samples/sec: 3350.03 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:56:47,589 epoch 7 - iter 144/723 - loss 0.01933338 - time (sec): 10.33 - samples/sec: 3321.61 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:56:53,263 epoch 7 - iter 216/723 - loss 0.01860946 - time (sec): 16.00 - samples/sec: 3316.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:56:58,767 epoch 7 - iter 288/723 - loss 0.01987477 - time (sec): 21.51 - samples/sec: 3329.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:57:04,192 epoch 7 - iter 360/723 - loss 0.01978143 - time (sec): 26.93 - samples/sec: 3325.96 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:57:09,642 epoch 7 - iter 432/723 - loss 0.02023814 - time (sec): 32.38 - samples/sec: 3310.10 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:57:14,799 epoch 7 - iter 504/723 - loss 0.01949429 - time (sec): 37.54 - samples/sec: 3315.96 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:57:19,826 epoch 7 - iter 576/723 - loss 0.01865648 - time (sec): 42.57 - samples/sec: 3327.50 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:57:24,812 epoch 7 - iter 648/723 - loss 0.01853140 - time (sec): 47.55 - samples/sec: 3333.24 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:57:30,046 epoch 7 - iter 720/723 - loss 0.01839669 - time (sec): 52.79 - samples/sec: 3328.86 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:57:30,201 ----------------------------------------------------------------------------------------------------
2023-10-17 18:57:30,201 EPOCH 7 done: loss 0.0184 - lr: 0.000010
2023-10-17 18:57:33,757 DEV : loss 0.10578546673059464 - f1-score (micro avg) 0.8809
2023-10-17 18:57:33,774 ----------------------------------------------------------------------------------------------------
2023-10-17 18:57:38,932 epoch 8 - iter 72/723 - loss 0.00841995 - time (sec): 5.16 - samples/sec: 3443.43 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:57:44,074 epoch 8 - iter 144/723 - loss 0.01229420 - time (sec): 10.30 - samples/sec: 3426.89 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:57:49,096 epoch 8 - iter 216/723 - loss 0.01355410 - time (sec): 15.32 - samples/sec: 3395.51 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:57:54,305 epoch 8 - iter 288/723 - loss 0.01394882 - time (sec): 20.53 - samples/sec: 3385.90 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:57:59,323 epoch 8 - iter 360/723 - loss 0.01323447 - time (sec): 25.55 - samples/sec: 3374.12 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:58:04,498 epoch 8 - iter 432/723 - loss 0.01270974 - time (sec): 30.72 - samples/sec: 3377.16 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:58:09,812 epoch 8 - iter 504/723 - loss 0.01255571 - time (sec): 36.04 - samples/sec: 3355.95 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:58:15,497 epoch 8 - iter 576/723 - loss 0.01341256 - time (sec): 41.72 - samples/sec: 3360.99 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:58:20,693 epoch 8 - iter 648/723 - loss 0.01371160 - time (sec): 46.92 - samples/sec: 3357.01 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:58:26,278 epoch 8 - iter 720/723 - loss 0.01394702 - time (sec): 52.50 - samples/sec: 3344.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:58:26,474 ----------------------------------------------------------------------------------------------------
2023-10-17 18:58:26,474 EPOCH 8 done: loss 0.0139 - lr: 0.000007
2023-10-17 18:58:29,679 DEV : loss 0.11371435225009918 - f1-score (micro avg) 0.8805
2023-10-17 18:58:29,695 ----------------------------------------------------------------------------------------------------
2023-10-17 18:58:35,055 epoch 9 - iter 72/723 - loss 0.01003235 - time (sec): 5.36 - samples/sec: 3286.54 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:58:40,336 epoch 9 - iter 144/723 - loss 0.00981857 - time (sec): 10.64 - samples/sec: 3403.11 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:58:45,118 epoch 9 - iter 216/723 - loss 0.01076096 - time (sec): 15.42 - samples/sec: 3432.99 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:58:50,044 epoch 9 - iter 288/723 - loss 0.01027158 - time (sec): 20.35 - samples/sec: 3464.18 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:58:55,521 epoch 9 - iter 360/723 - loss 0.00989332 - time (sec): 25.82 - samples/sec: 3429.30 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:59:00,544 epoch 9 - iter 432/723 - loss 0.00992623 - time (sec): 30.85 - samples/sec: 3432.33 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:59:06,392 epoch 9 - iter 504/723 - loss 0.01080183 - time (sec): 36.70 - samples/sec: 3398.02 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:59:11,482 epoch 9 - iter 576/723 - loss 0.01069467 - time (sec): 41.79 - samples/sec: 3391.20 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:59:16,665 epoch 9 - iter 648/723 - loss 0.01087289 - time (sec): 46.97 - samples/sec: 3403.56 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:59:21,382 epoch 9 - iter 720/723 - loss 0.01170845 - time (sec): 51.69 - samples/sec: 3401.34 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:59:21,537 ----------------------------------------------------------------------------------------------------
2023-10-17 18:59:21,537 EPOCH 9 done: loss 0.0117 - lr: 0.000003
2023-10-17 18:59:24,756 DEV : loss 0.11608566343784332 - f1-score (micro avg) 0.8813
2023-10-17 18:59:24,773 ----------------------------------------------------------------------------------------------------
2023-10-17 18:59:30,213 epoch 10 - iter 72/723 - loss 0.01498393 - time (sec): 5.44 - samples/sec: 3305.97 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:59:35,092 epoch 10 - iter 144/723 - loss 0.00950962 - time (sec): 10.32 - samples/sec: 3395.52 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:59:40,454 epoch 10 - iter 216/723 - loss 0.00882272 - time (sec): 15.68 - samples/sec: 3373.59 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:59:45,853 epoch 10 - iter 288/723 - loss 0.00879424 - time (sec): 21.08 - samples/sec: 3350.52 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:59:50,927 epoch 10 - iter 360/723 - loss 0.00903110 - time (sec): 26.15 - samples/sec: 3359.20 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:59:56,386 epoch 10 - iter 432/723 - loss 0.00841109 - time (sec): 31.61 - samples/sec: 3355.32 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:00:01,775 epoch 10 - iter 504/723 - loss 0.00806376 - time (sec): 37.00 - samples/sec: 3325.43 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:00:06,842 epoch 10 - iter 576/723 - loss 0.00790310 - time (sec): 42.07 - samples/sec: 3324.90 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:00:12,025 epoch 10 - iter 648/723 - loss 0.00804525 - time (sec): 47.25 - samples/sec: 3339.23 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:00:17,383 epoch 10 - iter 720/723 - loss 0.00796014 - time (sec): 52.61 - samples/sec: 3342.17 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:00:17,532 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:17,533 EPOCH 10 done: loss 0.0079 - lr: 0.000000
2023-10-17 19:00:21,687 DEV : loss 0.12145841866731644 - f1-score (micro avg) 0.8792
2023-10-17 19:00:22,119 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:22,121 Loading model from best epoch ...
2023-10-17 19:00:23,863 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 19:00:27,596
Results:
- F-score (micro) 0.8643
- F-score (macro) 0.7231
- Accuracy 0.7673

By class:
              precision    recall  f1-score   support

         PER     0.8669    0.8651    0.8660       482
         LOC     0.9509    0.8886    0.9187       458
         ORG     0.5714    0.2899    0.3846        69

   micro avg     0.8941    0.8365    0.8643      1009
   macro avg     0.7964    0.6812    0.7231      1009
weighted avg     0.8849    0.8365    0.8570      1009

2023-10-17 19:00:27,596 ----------------------------------------------------------------------------------------------------
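The macro and weighted rows in the final test table follow directly from the three per-class rows. A quick sanity check on the F1 column (weighted precision recomputed this way lands at 0.8848 rather than the reported 0.8849, since the per-class values are already rounded to four decimals):

```python
# Per-class test results from the training.log table above:
# label -> (precision, recall, f1, support).
per_class = {
    "PER": (0.8669, 0.8651, 0.8660, 482),
    "LOC": (0.9509, 0.8886, 0.9187, 458),
    "ORG": (0.5714, 0.2899, 0.3846, 69),
}

# Total number of gold entities (the "support" column).
support_total = sum(s for (_, _, _, s) in per_class.values())

# Macro average: unweighted mean over classes.
macro_f1 = sum(f1 for (_, _, f1, _) in per_class.values()) / len(per_class)

# Weighted average: mean over classes weighted by support.
weighted_f1 = sum(f1 * s for (_, _, f1, s) in per_class.values()) / support_total
```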