Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697544444.bce904bcef33.2023.6 +3 -0
- test.tsv +0 -0
- training.log +236 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0f0daf5e3114c34bd33bdf35ab7b87ffcb651eb8c56e4095458ac9b3954fc147
+size 440941957
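best-model.pt is tracked with Git LFS, so the diff above shows only the three-line pointer stub rather than the ~440 MB checkpoint itself. A minimal sketch of reading such a stub, assuming only the `key value` layout shown above (`parse_lfs_pointer` is a hypothetical helper, not part of any library):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields.

    Pointer files are small text stand-ins ("version", "oid", "size")
    for the real binary stored on the LFS server.
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# The pointer stored in this commit for best-model.pt:
POINTER = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:0f0daf5e3114c34bd33bdf35ab7b87ffcb651eb8c56e4095458ac9b3954fc147\n"
    "size 440941957\n"
)

info = parse_lfs_pointer(POINTER)
size_mb = int(info["size"]) / 1_000_000  # size is in bytes
```

The `size` field confirms that cloning this repo without LFS fetches only the stub; the full checkpoint is downloaded on demand.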
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH	TIMESTAMP	LEARNING_RATE	TRAIN_LOSS	DEV_LOSS	DEV_PRECISION	DEV_RECALL	DEV_F1	DEV_ACCURACY
+1	12:08:30	0.0000	0.4767	0.0905	0.6822	0.7285	0.7046	0.5615
+2	12:09:34	0.0000	0.1058	0.0820	0.7181	0.7523	0.7348	0.5932
+3	12:10:42	0.0000	0.0731	0.0863	0.7350	0.8032	0.7676	0.6437
+4	12:11:48	0.0000	0.0519	0.1446	0.7469	0.7613	0.7541	0.6220
+5	12:12:52	0.0000	0.0400	0.1545	0.7483	0.7432	0.7457	0.6117
+6	12:13:55	0.0000	0.0308	0.1691	0.7376	0.7760	0.7563	0.6248
+7	12:15:00	0.0000	0.0239	0.1992	0.7354	0.7828	0.7584	0.6245
+8	12:16:05	0.0000	0.0171	0.2246	0.7427	0.7771	0.7595	0.6263
+9	12:17:10	0.0000	0.0137	0.2310	0.7360	0.7885	0.7613	0.6291
+10	12:18:15	0.0000	0.0106	0.2380	0.7522	0.7828	0.7672	0.6343
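loss.tsv records one row of dev-set metrics per epoch. As a quick sanity check, a minimal stdlib-only sketch (rows copied verbatim from the table above) that recovers the best epoch by dev F1:

```python
# Whitespace-separated copy of loss.tsv from this commit.
LOSS_TABLE = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 12:08:30 0.0000 0.4767 0.0905 0.6822 0.7285 0.7046 0.5615
2 12:09:34 0.0000 0.1058 0.0820 0.7181 0.7523 0.7348 0.5932
3 12:10:42 0.0000 0.0731 0.0863 0.7350 0.8032 0.7676 0.6437
4 12:11:48 0.0000 0.0519 0.1446 0.7469 0.7613 0.7541 0.6220
5 12:12:52 0.0000 0.0400 0.1545 0.7483 0.7432 0.7457 0.6117
6 12:13:55 0.0000 0.0308 0.1691 0.7376 0.7760 0.7563 0.6248
7 12:15:00 0.0000 0.0239 0.1992 0.7354 0.7828 0.7584 0.6245
8 12:16:05 0.0000 0.0171 0.2246 0.7427 0.7771 0.7595 0.6263
9 12:17:10 0.0000 0.0137 0.2310 0.7360 0.7885 0.7613 0.6291
10 12:18:15 0.0000 0.0106 0.2380 0.7522 0.7828 0.7672 0.6343
"""

lines = LOSS_TABLE.strip().splitlines()
header = lines[0].split()
rows = [dict(zip(header, line.split())) for line in lines[1:]]

# Pick the epoch with the highest dev F1.
best = max(rows, key=lambda row: float(row["DEV_F1"]))
```

Epoch 3's dev F1 of 0.7676 is the maximum, which is consistent with the last "saving best model" entry in training.log below; the later dev-loss increases alongside falling train loss suggest the model starts overfitting after that point.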
runs/events.out.tfevents.1697544444.bce904bcef33.2023.6
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2719f516747696650fd77662bb688cc9608316a0d8a737783c4203219511c078
+size 556612
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,236 @@
+2023-10-17 12:07:24,738 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:07:24,739 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): ElectraModel(
+      (embeddings): ElectraEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): ElectraEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x ElectraLayer(
+            (attention): ElectraAttention(
+              (self): ElectraSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): ElectraSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): ElectraIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): ElectraOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:07:24,739 MultiCorpus: 7936 train + 992 dev + 992 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
+2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:07:24,739 Train: 7936 sentences
+2023-10-17 12:07:24,739 (train_with_dev=False, train_with_test=False)
+2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:07:24,739 Training Params:
+2023-10-17 12:07:24,739  - learning_rate: "3e-05"
+2023-10-17 12:07:24,739  - mini_batch_size: "8"
+2023-10-17 12:07:24,739  - max_epochs: "10"
+2023-10-17 12:07:24,739  - shuffle: "True"
+2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:07:24,739 Plugins:
+2023-10-17 12:07:24,739  - TensorboardLogger
+2023-10-17 12:07:24,739  - LinearScheduler | warmup_fraction: '0.1'
+2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:07:24,740 Final evaluation on model from best epoch (best-model.pt)
+2023-10-17 12:07:24,740  - metric: "('micro avg', 'f1-score')"
+2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:07:24,740 Computation:
+2023-10-17 12:07:24,740  - compute on device: cuda:0
+2023-10-17 12:07:24,740  - embedding storage: none
+2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:07:24,740 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
+2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:07:24,740 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-17 12:07:30,633 epoch 1 - iter 99/992 - loss 2.81599692 - time (sec): 5.89 - samples/sec: 2704.36 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 12:07:37,189 epoch 1 - iter 198/992 - loss 1.62981031 - time (sec): 12.45 - samples/sec: 2621.75 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 12:07:43,470 epoch 1 - iter 297/992 - loss 1.19195552 - time (sec): 18.73 - samples/sec: 2606.95 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 12:07:49,903 epoch 1 - iter 396/992 - loss 0.94531236 - time (sec): 25.16 - samples/sec: 2599.13 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 12:07:56,328 epoch 1 - iter 495/992 - loss 0.78791628 - time (sec): 31.59 - samples/sec: 2619.30 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 12:08:02,769 epoch 1 - iter 594/992 - loss 0.69398148 - time (sec): 38.03 - samples/sec: 2604.33 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 12:08:09,093 epoch 1 - iter 693/992 - loss 0.61610072 - time (sec): 44.35 - samples/sec: 2617.66 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 12:08:15,034 epoch 1 - iter 792/992 - loss 0.56155201 - time (sec): 50.29 - samples/sec: 2618.04 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 12:08:20,806 epoch 1 - iter 891/992 - loss 0.51541319 - time (sec): 56.06 - samples/sec: 2632.79 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 12:08:26,516 epoch 1 - iter 990/992 - loss 0.47731115 - time (sec): 61.78 - samples/sec: 2650.78 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 12:08:26,615 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:08:26,615 EPOCH 1 done: loss 0.4767 - lr: 0.000030
+2023-10-17 12:08:29,992 DEV : loss 0.09053196012973785 - f1-score (micro avg) 0.7046
+2023-10-17 12:08:30,020 saving best model
+2023-10-17 12:08:30,478 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:08:36,955 epoch 2 - iter 99/992 - loss 0.12503785 - time (sec): 6.47 - samples/sec: 2406.01 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 12:08:42,937 epoch 2 - iter 198/992 - loss 0.11258958 - time (sec): 12.46 - samples/sec: 2541.06 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 12:08:48,784 epoch 2 - iter 297/992 - loss 0.11476512 - time (sec): 18.30 - samples/sec: 2635.49 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 12:08:54,675 epoch 2 - iter 396/992 - loss 0.11579427 - time (sec): 24.19 - samples/sec: 2669.33 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 12:09:00,719 epoch 2 - iter 495/992 - loss 0.11485782 - time (sec): 30.24 - samples/sec: 2701.86 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 12:09:06,415 epoch 2 - iter 594/992 - loss 0.11334714 - time (sec): 35.93 - samples/sec: 2722.81 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 12:09:12,303 epoch 2 - iter 693/992 - loss 0.10946511 - time (sec): 41.82 - samples/sec: 2733.25 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 12:09:18,263 epoch 2 - iter 792/992 - loss 0.10780487 - time (sec): 47.78 - samples/sec: 2750.25 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 12:09:24,067 epoch 2 - iter 891/992 - loss 0.10649763 - time (sec): 53.59 - samples/sec: 2759.92 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 12:09:29,795 epoch 2 - iter 990/992 - loss 0.10582706 - time (sec): 59.31 - samples/sec: 2760.34 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 12:09:29,910 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:09:29,910 EPOCH 2 done: loss 0.1058 - lr: 0.000027
+2023-10-17 12:09:34,536 DEV : loss 0.08198774605989456 - f1-score (micro avg) 0.7348
+2023-10-17 12:09:34,563 saving best model
+2023-10-17 12:09:35,254 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:09:41,960 epoch 3 - iter 99/992 - loss 0.07989941 - time (sec): 6.70 - samples/sec: 2509.07 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 12:09:48,280 epoch 3 - iter 198/992 - loss 0.07549915 - time (sec): 13.02 - samples/sec: 2531.43 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 12:09:54,730 epoch 3 - iter 297/992 - loss 0.07291906 - time (sec): 19.47 - samples/sec: 2557.87 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 12:10:00,729 epoch 3 - iter 396/992 - loss 0.07510826 - time (sec): 25.47 - samples/sec: 2542.56 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 12:10:06,998 epoch 3 - iter 495/992 - loss 0.07247563 - time (sec): 31.74 - samples/sec: 2560.55 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 12:10:13,353 epoch 3 - iter 594/992 - loss 0.07191519 - time (sec): 38.10 - samples/sec: 2559.42 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 12:10:20,186 epoch 3 - iter 693/992 - loss 0.07226050 - time (sec): 44.93 - samples/sec: 2567.25 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 12:10:26,714 epoch 3 - iter 792/992 - loss 0.07266178 - time (sec): 51.46 - samples/sec: 2563.10 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 12:10:32,557 epoch 3 - iter 891/992 - loss 0.07256785 - time (sec): 57.30 - samples/sec: 2570.93 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 12:10:38,603 epoch 3 - iter 990/992 - loss 0.07318580 - time (sec): 63.35 - samples/sec: 2583.60 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 12:10:38,731 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:10:38,732 EPOCH 3 done: loss 0.0731 - lr: 0.000023
+2023-10-17 12:10:42,404 DEV : loss 0.08630654215812683 - f1-score (micro avg) 0.7676
+2023-10-17 12:10:42,428 saving best model
+2023-10-17 12:10:42,965 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:10:49,431 epoch 4 - iter 99/992 - loss 0.04416682 - time (sec): 6.46 - samples/sec: 2587.20 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 12:10:55,552 epoch 4 - iter 198/992 - loss 0.05047964 - time (sec): 12.58 - samples/sec: 2567.57 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 12:11:01,648 epoch 4 - iter 297/992 - loss 0.05348240 - time (sec): 18.68 - samples/sec: 2544.86 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 12:11:08,069 epoch 4 - iter 396/992 - loss 0.05389510 - time (sec): 25.10 - samples/sec: 2573.75 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 12:11:14,268 epoch 4 - iter 495/992 - loss 0.05314738 - time (sec): 31.30 - samples/sec: 2590.35 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 12:11:20,252 epoch 4 - iter 594/992 - loss 0.05290348 - time (sec): 37.28 - samples/sec: 2600.92 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 12:11:26,215 epoch 4 - iter 693/992 - loss 0.05269183 - time (sec): 43.25 - samples/sec: 2627.62 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 12:11:32,343 epoch 4 - iter 792/992 - loss 0.05275766 - time (sec): 49.37 - samples/sec: 2642.76 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 12:11:38,671 epoch 4 - iter 891/992 - loss 0.05126323 - time (sec): 55.70 - samples/sec: 2648.76 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 12:11:44,771 epoch 4 - iter 990/992 - loss 0.05192802 - time (sec): 61.80 - samples/sec: 2648.78 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 12:11:44,890 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:11:44,890 EPOCH 4 done: loss 0.0519 - lr: 0.000020
+2023-10-17 12:11:48,459 DEV : loss 0.14457282423973083 - f1-score (micro avg) 0.7541
+2023-10-17 12:11:48,482 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:11:54,656 epoch 5 - iter 99/992 - loss 0.04255254 - time (sec): 6.17 - samples/sec: 2746.18 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 12:12:00,775 epoch 5 - iter 198/992 - loss 0.03721103 - time (sec): 12.29 - samples/sec: 2715.66 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 12:12:06,602 epoch 5 - iter 297/992 - loss 0.03770170 - time (sec): 18.12 - samples/sec: 2739.09 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 12:12:12,537 epoch 5 - iter 396/992 - loss 0.04002274 - time (sec): 24.05 - samples/sec: 2729.65 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 12:12:18,393 epoch 5 - iter 495/992 - loss 0.04097590 - time (sec): 29.91 - samples/sec: 2731.15 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 12:12:24,013 epoch 5 - iter 594/992 - loss 0.04084997 - time (sec): 35.53 - samples/sec: 2731.78 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 12:12:30,275 epoch 5 - iter 693/992 - loss 0.04184421 - time (sec): 41.79 - samples/sec: 2728.74 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 12:12:36,500 epoch 5 - iter 792/992 - loss 0.04039337 - time (sec): 48.02 - samples/sec: 2724.95 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 12:12:42,422 epoch 5 - iter 891/992 - loss 0.04016748 - time (sec): 53.94 - samples/sec: 2723.89 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 12:12:48,687 epoch 5 - iter 990/992 - loss 0.04010609 - time (sec): 60.20 - samples/sec: 2718.44 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 12:12:48,815 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:12:48,815 EPOCH 5 done: loss 0.0400 - lr: 0.000017
+2023-10-17 12:12:52,461 DEV : loss 0.1545393168926239 - f1-score (micro avg) 0.7457
+2023-10-17 12:12:52,486 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:12:58,240 epoch 6 - iter 99/992 - loss 0.03053762 - time (sec): 5.75 - samples/sec: 2808.37 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 12:13:04,393 epoch 6 - iter 198/992 - loss 0.03077824 - time (sec): 11.91 - samples/sec: 2752.27 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 12:13:10,152 epoch 6 - iter 297/992 - loss 0.03119674 - time (sec): 17.66 - samples/sec: 2751.63 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 12:13:16,163 epoch 6 - iter 396/992 - loss 0.03144636 - time (sec): 23.68 - samples/sec: 2771.61 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 12:13:22,454 epoch 6 - iter 495/992 - loss 0.03099838 - time (sec): 29.97 - samples/sec: 2759.57 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 12:13:28,307 epoch 6 - iter 594/992 - loss 0.03087116 - time (sec): 35.82 - samples/sec: 2772.51 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 12:13:34,052 epoch 6 - iter 693/992 - loss 0.03073382 - time (sec): 41.56 - samples/sec: 2759.70 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 12:13:39,994 epoch 6 - iter 792/992 - loss 0.03046479 - time (sec): 47.51 - samples/sec: 2760.03 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 12:13:46,007 epoch 6 - iter 891/992 - loss 0.03053390 - time (sec): 53.52 - samples/sec: 2751.13 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 12:13:52,001 epoch 6 - iter 990/992 - loss 0.03080662 - time (sec): 59.51 - samples/sec: 2750.77 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 12:13:52,137 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:13:52,138 EPOCH 6 done: loss 0.0308 - lr: 0.000013
+2023-10-17 12:13:55,802 DEV : loss 0.16906479001045227 - f1-score (micro avg) 0.7563
+2023-10-17 12:13:55,829 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:14:02,579 epoch 7 - iter 99/992 - loss 0.03633557 - time (sec): 6.75 - samples/sec: 2451.08 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 12:14:08,575 epoch 7 - iter 198/992 - loss 0.02602492 - time (sec): 12.74 - samples/sec: 2612.37 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 12:14:14,831 epoch 7 - iter 297/992 - loss 0.02587295 - time (sec): 19.00 - samples/sec: 2642.57 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 12:14:20,871 epoch 7 - iter 396/992 - loss 0.02406875 - time (sec): 25.04 - samples/sec: 2668.88 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 12:14:26,876 epoch 7 - iter 495/992 - loss 0.02415969 - time (sec): 31.04 - samples/sec: 2694.00 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 12:14:32,581 epoch 7 - iter 594/992 - loss 0.02344326 - time (sec): 36.75 - samples/sec: 2705.12 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 12:14:38,429 epoch 7 - iter 693/992 - loss 0.02349137 - time (sec): 42.60 - samples/sec: 2700.29 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 12:14:44,666 epoch 7 - iter 792/992 - loss 0.02373935 - time (sec): 48.83 - samples/sec: 2687.24 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 12:14:50,524 epoch 7 - iter 891/992 - loss 0.02317591 - time (sec): 54.69 - samples/sec: 2685.02 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 12:14:56,875 epoch 7 - iter 990/992 - loss 0.02379480 - time (sec): 61.04 - samples/sec: 2681.55 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 12:14:56,989 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:14:56,989 EPOCH 7 done: loss 0.0239 - lr: 0.000010
+2023-10-17 12:15:00,763 DEV : loss 0.19923286139965057 - f1-score (micro avg) 0.7584
+2023-10-17 12:15:00,789 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:15:07,085 epoch 8 - iter 99/992 - loss 0.01766621 - time (sec): 6.29 - samples/sec: 2638.94 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 12:15:13,503 epoch 8 - iter 198/992 - loss 0.01948794 - time (sec): 12.71 - samples/sec: 2649.48 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 12:15:19,541 epoch 8 - iter 297/992 - loss 0.02060868 - time (sec): 18.75 - samples/sec: 2637.99 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 12:15:25,392 epoch 8 - iter 396/992 - loss 0.02079230 - time (sec): 24.60 - samples/sec: 2664.26 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 12:15:31,525 epoch 8 - iter 495/992 - loss 0.01946266 - time (sec): 30.73 - samples/sec: 2681.18 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 12:15:37,627 epoch 8 - iter 594/992 - loss 0.01905439 - time (sec): 36.84 - samples/sec: 2670.25 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 12:15:43,462 epoch 8 - iter 693/992 - loss 0.01877632 - time (sec): 42.67 - samples/sec: 2664.42 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 12:15:49,740 epoch 8 - iter 792/992 - loss 0.01826040 - time (sec): 48.95 - samples/sec: 2673.89 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 12:15:55,638 epoch 8 - iter 891/992 - loss 0.01771491 - time (sec): 54.85 - samples/sec: 2679.21 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 12:16:01,869 epoch 8 - iter 990/992 - loss 0.01711630 - time (sec): 61.08 - samples/sec: 2679.87 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 12:16:01,999 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:16:01,999 EPOCH 8 done: loss 0.0171 - lr: 0.000007
+2023-10-17 12:16:05,753 DEV : loss 0.22457100450992584 - f1-score (micro avg) 0.7595
+2023-10-17 12:16:05,780 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:16:11,813 epoch 9 - iter 99/992 - loss 0.00974185 - time (sec): 6.03 - samples/sec: 2591.16 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 12:16:17,738 epoch 9 - iter 198/992 - loss 0.01453937 - time (sec): 11.96 - samples/sec: 2666.01 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 12:16:24,024 epoch 9 - iter 297/992 - loss 0.01576870 - time (sec): 18.24 - samples/sec: 2645.28 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 12:16:30,258 epoch 9 - iter 396/992 - loss 0.01526670 - time (sec): 24.48 - samples/sec: 2658.71 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 12:16:36,310 epoch 9 - iter 495/992 - loss 0.01459426 - time (sec): 30.53 - samples/sec: 2658.40 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 12:16:42,624 epoch 9 - iter 594/992 - loss 0.01510246 - time (sec): 36.84 - samples/sec: 2670.70 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 12:16:48,677 epoch 9 - iter 693/992 - loss 0.01477998 - time (sec): 42.90 - samples/sec: 2684.08 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 12:16:54,722 epoch 9 - iter 792/992 - loss 0.01392476 - time (sec): 48.94 - samples/sec: 2680.28 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 12:17:00,939 epoch 9 - iter 891/992 - loss 0.01406745 - time (sec): 55.16 - samples/sec: 2677.23 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 12:17:07,036 epoch 9 - iter 990/992 - loss 0.01370319 - time (sec): 61.25 - samples/sec: 2673.04 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 12:17:07,145 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:17:07,145 EPOCH 9 done: loss 0.0137 - lr: 0.000003
+2023-10-17 12:17:10,873 DEV : loss 0.2310420423746109 - f1-score (micro avg) 0.7613
+2023-10-17 12:17:10,897 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:17:17,103 epoch 10 - iter 99/992 - loss 0.01103320 - time (sec): 6.20 - samples/sec: 2698.89 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 12:17:23,243 epoch 10 - iter 198/992 - loss 0.01140684 - time (sec): 12.34 - samples/sec: 2673.86 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 12:17:29,396 epoch 10 - iter 297/992 - loss 0.01229238 - time (sec): 18.50 - samples/sec: 2699.56 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 12:17:35,321 epoch 10 - iter 396/992 - loss 0.01136142 - time (sec): 24.42 - samples/sec: 2692.86 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 12:17:41,585 epoch 10 - iter 495/992 - loss 0.01008313 - time (sec): 30.69 - samples/sec: 2704.21 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 12:17:47,624 epoch 10 - iter 594/992 - loss 0.01014504 - time (sec): 36.72 - samples/sec: 2718.86 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 12:17:53,556 epoch 10 - iter 693/992 - loss 0.01023189 - time (sec): 42.66 - samples/sec: 2728.14 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 12:17:59,429 epoch 10 - iter 792/992 - loss 0.01009792 - time (sec): 48.53 - samples/sec: 2724.82 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 12:18:05,512 epoch 10 - iter 891/992 - loss 0.01058406 - time (sec): 54.61 - samples/sec: 2706.44 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 12:18:11,517 epoch 10 - iter 990/992 - loss 0.01066306 - time (sec): 60.62 - samples/sec: 2701.38 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 12:18:11,627 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:18:11,628 EPOCH 10 done: loss 0.0106 - lr: 0.000000
+2023-10-17 12:18:15,865 DEV : loss 0.23796458542346954 - f1-score (micro avg) 0.7672
+2023-10-17 12:18:16,334 ----------------------------------------------------------------------------------------------------
+2023-10-17 12:18:16,336 Loading model from best epoch ...
+2023-10-17 12:18:17,913 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
+2023-10-17 12:18:21,757
+Results:
+- F-score (micro) 0.7771
+- F-score (macro) 0.697
+- Accuracy 0.6629
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.8358    0.8626    0.8490       655
+         PER     0.6897    0.8072    0.7438       223
+         ORG     0.4494    0.5591    0.4982       127
+
+   micro avg     0.7452    0.8119    0.7771      1005
+   macro avg     0.6583    0.7429    0.6970      1005
+weighted avg     0.7545    0.8119    0.7813      1005
+
+2023-10-17 12:18:21,758 ----------------------------------------------------------------------------------------------------