stefan-it committed
Commit fdf3c3d · 1 Parent(s): 18bfecb

Upload folder using huggingface_hub
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0f0daf5e3114c34bd33bdf35ab7b87ffcb651eb8c56e4095458ac9b3954fc147
+ size 440941957
dev.tsv ADDED
The diff for this file is too large to render. See raw diff

loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+ 1      12:08:30   0.0000         0.4767      0.0905    0.6822         0.7285      0.7046  0.5615
+ 2      12:09:34   0.0000         0.1058      0.0820    0.7181         0.7523      0.7348  0.5932
+ 3      12:10:42   0.0000         0.0731      0.0863    0.7350         0.8032      0.7676  0.6437
+ 4      12:11:48   0.0000         0.0519      0.1446    0.7469         0.7613      0.7541  0.6220
+ 5      12:12:52   0.0000         0.0400      0.1545    0.7483         0.7432      0.7457  0.6117
+ 6      12:13:55   0.0000         0.0308      0.1691    0.7376         0.7760      0.7563  0.6248
+ 7      12:15:00   0.0000         0.0239      0.1992    0.7354         0.7828      0.7584  0.6245
+ 8      12:16:05   0.0000         0.0171      0.2246    0.7427         0.7771      0.7595  0.6263
+ 9      12:17:10   0.0000         0.0137      0.2310    0.7360         0.7885      0.7613  0.6291
+ 10     12:18:15   0.0000         0.0106      0.2380    0.7522         0.7828      0.7672  0.6343
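The dev F1 in loss.tsv peaks at epoch 3 (0.7676), which is also the last point in training.log where "saving best model" appears. A minimal sketch of picking the best epoch from such a TSV (the file contents are inlined here as an excerpt for illustration):

```python
import csv
import io

# Excerpt of loss.tsv above: only EPOCH and DEV_F1 are kept for brevity.
LOSS_TSV = """EPOCH\tDEV_F1
1\t0.7046
2\t0.7348
3\t0.7676
4\t0.7541
5\t0.7457
6\t0.7563
7\t0.7584
8\t0.7595
9\t0.7613
10\t0.7672
"""

def best_epoch(tsv_text: str) -> tuple[int, float]:
    """Return (epoch, dev_f1) for the row with the highest DEV_F1."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    best = max(rows, key=lambda r: float(r["DEV_F1"]))
    return int(best["EPOCH"]), float(best["DEV_F1"])

print(best_epoch(LOSS_TSV))  # (3, 0.7676)
```

Note that dev loss keeps rising after epoch 3 while train loss falls, so best-model.pt is the epoch-3 checkpoint rather than the final one.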
runs/events.out.tfevents.1697544444.bce904bcef33.2023.6 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2719f516747696650fd77662bb688cc9608316a0d8a737783c4203219511c078
+ size 556612
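best-model.pt and the TensorBoard event file are tracked with Git LFS, so the diff shows only a three-line pointer file: the spec version, the SHA-256 of the real blob, and its size in bytes. A hedged sketch of how such a pointer is derived from a file's contents (format per the git-lfs spec; the helper name is our own):

```python
import hashlib

def lfs_pointer(data: bytes) -> str:
    """Build a Git LFS pointer for raw file contents.

    Mirrors the three lines in the diffs above:
    version / oid sha256:<hex digest> / size <bytes>.
    """
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )

print(lfs_pointer(b"hello"))
```

The pointer is what lives in Git history; `git lfs pull` resolves the oid to the 440 MB checkpoint itself.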
test.tsv ADDED
The diff for this file is too large to render. See raw diff

training.log ADDED
@@ -0,0 +1,236 @@
+ 2023-10-17 12:07:24,738 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:07:24,739 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): ElectraModel(
+       (embeddings): ElectraEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): ElectraEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x ElectraLayer(
+             (attention): ElectraAttention(
+               (self): ElectraSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): ElectraSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): ElectraIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): ElectraOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
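The shapes in the model summary are enough to tally the parameter count by hand (dropout and activations contribute none). As a sanity check, 12 encoder layers at hidden size 768 plus the embeddings come to roughly 110 M parameters, which at 4 bytes per float32 weight is consistent with the ~440 MB best-model.pt pointer above. Pure arithmetic from the printed dimensions:

```python
H, FF, LAYERS, VOCAB, MAX_POS, TAGS = 768, 3072, 12, 32001, 512, 13

def dense(n_in: int, n_out: int) -> int:
    # Linear layer: weight matrix plus bias vector.
    return n_in * n_out + n_out

layer_norm = 2 * H  # scale + shift vectors

# ElectraEmbeddings: word + position + token-type tables, plus one LayerNorm.
embeddings = VOCAB * H + MAX_POS * H + 2 * H + layer_norm

# One ElectraLayer, following the module printout above.
per_layer = (
    4 * dense(H, H)   # query, key, value, self-output dense
    + layer_norm      # ElectraSelfOutput LayerNorm
    + dense(H, FF)    # intermediate
    + dense(FF, H)    # output
    + layer_norm      # ElectraOutput LayerNorm
)

head = dense(H, TAGS)  # final tag projection (13 tags)

total_params = embeddings + LAYERS * per_layer + head
print(total_params)  # 110037517, i.e. ~440 MB of float32 weights
```

The checkpoint is slightly larger than `total_params * 4` bytes because it also serializes non-weight state.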
+ 2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:07:24,739 MultiCorpus: 7936 train + 992 dev + 992 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
+ 2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:07:24,739 Train: 7936 sentences
+ 2023-10-17 12:07:24,739 (train_with_dev=False, train_with_test=False)
+ 2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:07:24,739 Training Params:
+ 2023-10-17 12:07:24,739 - learning_rate: "3e-05"
+ 2023-10-17 12:07:24,739 - mini_batch_size: "8"
+ 2023-10-17 12:07:24,739 - max_epochs: "10"
+ 2023-10-17 12:07:24,739 - shuffle: "True"
+ 2023-10-17 12:07:24,739 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:07:24,739 Plugins:
+ 2023-10-17 12:07:24,739 - TensorboardLogger
+ 2023-10-17 12:07:24,739 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:07:24,740 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-17 12:07:24,740 - metric: "('micro avg', 'f1-score')"
+ 2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:07:24,740 Computation:
+ 2023-10-17 12:07:24,740 - compute on device: cuda:0
+ 2023-10-17 12:07:24,740 - embedding storage: none
+ 2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:07:24,740 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
+ 2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:07:24,740 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:07:24,740 Logging anything other than scalars to TensorBoard is currently not supported.
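The LinearScheduler with warmup_fraction '0.1' explains the lr column in the iteration logs that follow: the learning rate climbs linearly from 0 to the peak 3e-05 over the first 10% of updates (992 of the 9,920 total batches, i.e. exactly epoch 1), then decays linearly back to 0 by the last step. A small sketch of that schedule (our own reimplementation for illustration, not Flair's code):

```python
PEAK_LR = 3e-05
STEPS_PER_EPOCH = 992                  # 7936 train sentences / mini_batch_size 8
TOTAL_STEPS = 10 * STEPS_PER_EPOCH     # max_epochs 10
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # warmup_fraction 0.1 -> 992 steps

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero."""
    if step <= WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Matches the logged values (rounded to 6 decimals, as in the log):
print(round(lr_at(99), 6))    # 3e-06  (epoch 1, iter 99/992)
print(round(lr_at(990), 6))   # 3e-05  (end of warmup)
print(round(lr_at(9920), 6))  # 0.0    (final step of epoch 10)
```

This also explains why loss.tsv reports LEARNING_RATE as 0.0000: it logs the rate at epoch end, after decay, rounded to four decimals.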
+ 2023-10-17 12:07:30,633 epoch 1 - iter 99/992 - loss 2.81599692 - time (sec): 5.89 - samples/sec: 2704.36 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 12:07:37,189 epoch 1 - iter 198/992 - loss 1.62981031 - time (sec): 12.45 - samples/sec: 2621.75 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 12:07:43,470 epoch 1 - iter 297/992 - loss 1.19195552 - time (sec): 18.73 - samples/sec: 2606.95 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 12:07:49,903 epoch 1 - iter 396/992 - loss 0.94531236 - time (sec): 25.16 - samples/sec: 2599.13 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 12:07:56,328 epoch 1 - iter 495/992 - loss 0.78791628 - time (sec): 31.59 - samples/sec: 2619.30 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 12:08:02,769 epoch 1 - iter 594/992 - loss 0.69398148 - time (sec): 38.03 - samples/sec: 2604.33 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 12:08:09,093 epoch 1 - iter 693/992 - loss 0.61610072 - time (sec): 44.35 - samples/sec: 2617.66 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 12:08:15,034 epoch 1 - iter 792/992 - loss 0.56155201 - time (sec): 50.29 - samples/sec: 2618.04 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 12:08:20,806 epoch 1 - iter 891/992 - loss 0.51541319 - time (sec): 56.06 - samples/sec: 2632.79 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 12:08:26,516 epoch 1 - iter 990/992 - loss 0.47731115 - time (sec): 61.78 - samples/sec: 2650.78 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 12:08:26,615 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:08:26,615 EPOCH 1 done: loss 0.4767 - lr: 0.000030
+ 2023-10-17 12:08:29,992 DEV : loss 0.09053196012973785 - f1-score (micro avg) 0.7046
+ 2023-10-17 12:08:30,020 saving best model
+ 2023-10-17 12:08:30,478 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:08:36,955 epoch 2 - iter 99/992 - loss 0.12503785 - time (sec): 6.47 - samples/sec: 2406.01 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 12:08:42,937 epoch 2 - iter 198/992 - loss 0.11258958 - time (sec): 12.46 - samples/sec: 2541.06 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 12:08:48,784 epoch 2 - iter 297/992 - loss 0.11476512 - time (sec): 18.30 - samples/sec: 2635.49 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 12:08:54,675 epoch 2 - iter 396/992 - loss 0.11579427 - time (sec): 24.19 - samples/sec: 2669.33 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 12:09:00,719 epoch 2 - iter 495/992 - loss 0.11485782 - time (sec): 30.24 - samples/sec: 2701.86 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 12:09:06,415 epoch 2 - iter 594/992 - loss 0.11334714 - time (sec): 35.93 - samples/sec: 2722.81 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 12:09:12,303 epoch 2 - iter 693/992 - loss 0.10946511 - time (sec): 41.82 - samples/sec: 2733.25 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 12:09:18,263 epoch 2 - iter 792/992 - loss 0.10780487 - time (sec): 47.78 - samples/sec: 2750.25 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 12:09:24,067 epoch 2 - iter 891/992 - loss 0.10649763 - time (sec): 53.59 - samples/sec: 2759.92 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 12:09:29,795 epoch 2 - iter 990/992 - loss 0.10582706 - time (sec): 59.31 - samples/sec: 2760.34 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 12:09:29,910 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:09:29,910 EPOCH 2 done: loss 0.1058 - lr: 0.000027
+ 2023-10-17 12:09:34,536 DEV : loss 0.08198774605989456 - f1-score (micro avg) 0.7348
+ 2023-10-17 12:09:34,563 saving best model
+ 2023-10-17 12:09:35,254 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:09:41,960 epoch 3 - iter 99/992 - loss 0.07989941 - time (sec): 6.70 - samples/sec: 2509.07 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 12:09:48,280 epoch 3 - iter 198/992 - loss 0.07549915 - time (sec): 13.02 - samples/sec: 2531.43 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 12:09:54,730 epoch 3 - iter 297/992 - loss 0.07291906 - time (sec): 19.47 - samples/sec: 2557.87 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 12:10:00,729 epoch 3 - iter 396/992 - loss 0.07510826 - time (sec): 25.47 - samples/sec: 2542.56 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 12:10:06,998 epoch 3 - iter 495/992 - loss 0.07247563 - time (sec): 31.74 - samples/sec: 2560.55 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 12:10:13,353 epoch 3 - iter 594/992 - loss 0.07191519 - time (sec): 38.10 - samples/sec: 2559.42 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 12:10:20,186 epoch 3 - iter 693/992 - loss 0.07226050 - time (sec): 44.93 - samples/sec: 2567.25 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 12:10:26,714 epoch 3 - iter 792/992 - loss 0.07266178 - time (sec): 51.46 - samples/sec: 2563.10 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 12:10:32,557 epoch 3 - iter 891/992 - loss 0.07256785 - time (sec): 57.30 - samples/sec: 2570.93 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 12:10:38,603 epoch 3 - iter 990/992 - loss 0.07318580 - time (sec): 63.35 - samples/sec: 2583.60 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 12:10:38,731 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:10:38,732 EPOCH 3 done: loss 0.0731 - lr: 0.000023
+ 2023-10-17 12:10:42,404 DEV : loss 0.08630654215812683 - f1-score (micro avg) 0.7676
+ 2023-10-17 12:10:42,428 saving best model
+ 2023-10-17 12:10:42,965 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:10:49,431 epoch 4 - iter 99/992 - loss 0.04416682 - time (sec): 6.46 - samples/sec: 2587.20 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 12:10:55,552 epoch 4 - iter 198/992 - loss 0.05047964 - time (sec): 12.58 - samples/sec: 2567.57 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 12:11:01,648 epoch 4 - iter 297/992 - loss 0.05348240 - time (sec): 18.68 - samples/sec: 2544.86 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 12:11:08,069 epoch 4 - iter 396/992 - loss 0.05389510 - time (sec): 25.10 - samples/sec: 2573.75 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 12:11:14,268 epoch 4 - iter 495/992 - loss 0.05314738 - time (sec): 31.30 - samples/sec: 2590.35 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 12:11:20,252 epoch 4 - iter 594/992 - loss 0.05290348 - time (sec): 37.28 - samples/sec: 2600.92 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 12:11:26,215 epoch 4 - iter 693/992 - loss 0.05269183 - time (sec): 43.25 - samples/sec: 2627.62 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 12:11:32,343 epoch 4 - iter 792/992 - loss 0.05275766 - time (sec): 49.37 - samples/sec: 2642.76 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 12:11:38,671 epoch 4 - iter 891/992 - loss 0.05126323 - time (sec): 55.70 - samples/sec: 2648.76 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 12:11:44,771 epoch 4 - iter 990/992 - loss 0.05192802 - time (sec): 61.80 - samples/sec: 2648.78 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 12:11:44,890 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:11:44,890 EPOCH 4 done: loss 0.0519 - lr: 0.000020
+ 2023-10-17 12:11:48,459 DEV : loss 0.14457282423973083 - f1-score (micro avg) 0.7541
+ 2023-10-17 12:11:48,482 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:11:54,656 epoch 5 - iter 99/992 - loss 0.04255254 - time (sec): 6.17 - samples/sec: 2746.18 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 12:12:00,775 epoch 5 - iter 198/992 - loss 0.03721103 - time (sec): 12.29 - samples/sec: 2715.66 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 12:12:06,602 epoch 5 - iter 297/992 - loss 0.03770170 - time (sec): 18.12 - samples/sec: 2739.09 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 12:12:12,537 epoch 5 - iter 396/992 - loss 0.04002274 - time (sec): 24.05 - samples/sec: 2729.65 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 12:12:18,393 epoch 5 - iter 495/992 - loss 0.04097590 - time (sec): 29.91 - samples/sec: 2731.15 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 12:12:24,013 epoch 5 - iter 594/992 - loss 0.04084997 - time (sec): 35.53 - samples/sec: 2731.78 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 12:12:30,275 epoch 5 - iter 693/992 - loss 0.04184421 - time (sec): 41.79 - samples/sec: 2728.74 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 12:12:36,500 epoch 5 - iter 792/992 - loss 0.04039337 - time (sec): 48.02 - samples/sec: 2724.95 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 12:12:42,422 epoch 5 - iter 891/992 - loss 0.04016748 - time (sec): 53.94 - samples/sec: 2723.89 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 12:12:48,687 epoch 5 - iter 990/992 - loss 0.04010609 - time (sec): 60.20 - samples/sec: 2718.44 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 12:12:48,815 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:12:48,815 EPOCH 5 done: loss 0.0400 - lr: 0.000017
+ 2023-10-17 12:12:52,461 DEV : loss 0.1545393168926239 - f1-score (micro avg) 0.7457
+ 2023-10-17 12:12:52,486 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:12:58,240 epoch 6 - iter 99/992 - loss 0.03053762 - time (sec): 5.75 - samples/sec: 2808.37 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 12:13:04,393 epoch 6 - iter 198/992 - loss 0.03077824 - time (sec): 11.91 - samples/sec: 2752.27 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 12:13:10,152 epoch 6 - iter 297/992 - loss 0.03119674 - time (sec): 17.66 - samples/sec: 2751.63 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 12:13:16,163 epoch 6 - iter 396/992 - loss 0.03144636 - time (sec): 23.68 - samples/sec: 2771.61 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 12:13:22,454 epoch 6 - iter 495/992 - loss 0.03099838 - time (sec): 29.97 - samples/sec: 2759.57 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 12:13:28,307 epoch 6 - iter 594/992 - loss 0.03087116 - time (sec): 35.82 - samples/sec: 2772.51 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 12:13:34,052 epoch 6 - iter 693/992 - loss 0.03073382 - time (sec): 41.56 - samples/sec: 2759.70 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 12:13:39,994 epoch 6 - iter 792/992 - loss 0.03046479 - time (sec): 47.51 - samples/sec: 2760.03 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 12:13:46,007 epoch 6 - iter 891/992 - loss 0.03053390 - time (sec): 53.52 - samples/sec: 2751.13 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 12:13:52,001 epoch 6 - iter 990/992 - loss 0.03080662 - time (sec): 59.51 - samples/sec: 2750.77 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 12:13:52,137 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:13:52,138 EPOCH 6 done: loss 0.0308 - lr: 0.000013
+ 2023-10-17 12:13:55,802 DEV : loss 0.16906479001045227 - f1-score (micro avg) 0.7563
+ 2023-10-17 12:13:55,829 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:14:02,579 epoch 7 - iter 99/992 - loss 0.03633557 - time (sec): 6.75 - samples/sec: 2451.08 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 12:14:08,575 epoch 7 - iter 198/992 - loss 0.02602492 - time (sec): 12.74 - samples/sec: 2612.37 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 12:14:14,831 epoch 7 - iter 297/992 - loss 0.02587295 - time (sec): 19.00 - samples/sec: 2642.57 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 12:14:20,871 epoch 7 - iter 396/992 - loss 0.02406875 - time (sec): 25.04 - samples/sec: 2668.88 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 12:14:26,876 epoch 7 - iter 495/992 - loss 0.02415969 - time (sec): 31.04 - samples/sec: 2694.00 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 12:14:32,581 epoch 7 - iter 594/992 - loss 0.02344326 - time (sec): 36.75 - samples/sec: 2705.12 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 12:14:38,429 epoch 7 - iter 693/992 - loss 0.02349137 - time (sec): 42.60 - samples/sec: 2700.29 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 12:14:44,666 epoch 7 - iter 792/992 - loss 0.02373935 - time (sec): 48.83 - samples/sec: 2687.24 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 12:14:50,524 epoch 7 - iter 891/992 - loss 0.02317591 - time (sec): 54.69 - samples/sec: 2685.02 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 12:14:56,875 epoch 7 - iter 990/992 - loss 0.02379480 - time (sec): 61.04 - samples/sec: 2681.55 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 12:14:56,989 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:14:56,989 EPOCH 7 done: loss 0.0239 - lr: 0.000010
+ 2023-10-17 12:15:00,763 DEV : loss 0.19923286139965057 - f1-score (micro avg) 0.7584
+ 2023-10-17 12:15:00,789 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:15:07,085 epoch 8 - iter 99/992 - loss 0.01766621 - time (sec): 6.29 - samples/sec: 2638.94 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 12:15:13,503 epoch 8 - iter 198/992 - loss 0.01948794 - time (sec): 12.71 - samples/sec: 2649.48 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 12:15:19,541 epoch 8 - iter 297/992 - loss 0.02060868 - time (sec): 18.75 - samples/sec: 2637.99 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 12:15:25,392 epoch 8 - iter 396/992 - loss 0.02079230 - time (sec): 24.60 - samples/sec: 2664.26 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 12:15:31,525 epoch 8 - iter 495/992 - loss 0.01946266 - time (sec): 30.73 - samples/sec: 2681.18 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 12:15:37,627 epoch 8 - iter 594/992 - loss 0.01905439 - time (sec): 36.84 - samples/sec: 2670.25 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 12:15:43,462 epoch 8 - iter 693/992 - loss 0.01877632 - time (sec): 42.67 - samples/sec: 2664.42 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 12:15:49,740 epoch 8 - iter 792/992 - loss 0.01826040 - time (sec): 48.95 - samples/sec: 2673.89 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 12:15:55,638 epoch 8 - iter 891/992 - loss 0.01771491 - time (sec): 54.85 - samples/sec: 2679.21 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 12:16:01,869 epoch 8 - iter 990/992 - loss 0.01711630 - time (sec): 61.08 - samples/sec: 2679.87 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 12:16:01,999 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:16:01,999 EPOCH 8 done: loss 0.0171 - lr: 0.000007
+ 2023-10-17 12:16:05,753 DEV : loss 0.22457100450992584 - f1-score (micro avg) 0.7595
+ 2023-10-17 12:16:05,780 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:16:11,813 epoch 9 - iter 99/992 - loss 0.00974185 - time (sec): 6.03 - samples/sec: 2591.16 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 12:16:17,738 epoch 9 - iter 198/992 - loss 0.01453937 - time (sec): 11.96 - samples/sec: 2666.01 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 12:16:24,024 epoch 9 - iter 297/992 - loss 0.01576870 - time (sec): 18.24 - samples/sec: 2645.28 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 12:16:30,258 epoch 9 - iter 396/992 - loss 0.01526670 - time (sec): 24.48 - samples/sec: 2658.71 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 12:16:36,310 epoch 9 - iter 495/992 - loss 0.01459426 - time (sec): 30.53 - samples/sec: 2658.40 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 12:16:42,624 epoch 9 - iter 594/992 - loss 0.01510246 - time (sec): 36.84 - samples/sec: 2670.70 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 12:16:48,677 epoch 9 - iter 693/992 - loss 0.01477998 - time (sec): 42.90 - samples/sec: 2684.08 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 12:16:54,722 epoch 9 - iter 792/992 - loss 0.01392476 - time (sec): 48.94 - samples/sec: 2680.28 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 12:17:00,939 epoch 9 - iter 891/992 - loss 0.01406745 - time (sec): 55.16 - samples/sec: 2677.23 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 12:17:07,036 epoch 9 - iter 990/992 - loss 0.01370319 - time (sec): 61.25 - samples/sec: 2673.04 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 12:17:07,145 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:17:07,145 EPOCH 9 done: loss 0.0137 - lr: 0.000003
+ 2023-10-17 12:17:10,873 DEV : loss 0.2310420423746109 - f1-score (micro avg) 0.7613
+ 2023-10-17 12:17:10,897 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:17:17,103 epoch 10 - iter 99/992 - loss 0.01103320 - time (sec): 6.20 - samples/sec: 2698.89 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 12:17:23,243 epoch 10 - iter 198/992 - loss 0.01140684 - time (sec): 12.34 - samples/sec: 2673.86 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 12:17:29,396 epoch 10 - iter 297/992 - loss 0.01229238 - time (sec): 18.50 - samples/sec: 2699.56 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 12:17:35,321 epoch 10 - iter 396/992 - loss 0.01136142 - time (sec): 24.42 - samples/sec: 2692.86 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 12:17:41,585 epoch 10 - iter 495/992 - loss 0.01008313 - time (sec): 30.69 - samples/sec: 2704.21 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 12:17:47,624 epoch 10 - iter 594/992 - loss 0.01014504 - time (sec): 36.72 - samples/sec: 2718.86 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 12:17:53,556 epoch 10 - iter 693/992 - loss 0.01023189 - time (sec): 42.66 - samples/sec: 2728.14 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 12:17:59,429 epoch 10 - iter 792/992 - loss 0.01009792 - time (sec): 48.53 - samples/sec: 2724.82 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 12:18:05,512 epoch 10 - iter 891/992 - loss 0.01058406 - time (sec): 54.61 - samples/sec: 2706.44 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 12:18:11,517 epoch 10 - iter 990/992 - loss 0.01066306 - time (sec): 60.62 - samples/sec: 2701.38 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 12:18:11,627 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:18:11,628 EPOCH 10 done: loss 0.0106 - lr: 0.000000
+ 2023-10-17 12:18:15,865 DEV : loss 0.23796458542346954 - f1-score (micro avg) 0.7672
+ 2023-10-17 12:18:16,334 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 12:18:16,336 Loading model from best epoch ...
+ 2023-10-17 12:18:17,913 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
+ 2023-10-17 12:18:21,757
+ Results:
+ - F-score (micro) 0.7771
+ - F-score (macro) 0.697
+ - Accuracy 0.6629
+
+ By class:
+               precision    recall  f1-score   support
+
+          LOC     0.8358    0.8626    0.8490       655
+          PER     0.6897    0.8072    0.7438       223
+          ORG     0.4494    0.5591    0.4982       127
+
+    micro avg     0.7452    0.8119    0.7771      1005
+    macro avg     0.6583    0.7429    0.6970      1005
+ weighted avg     0.7545    0.8119    0.7813      1005
+
+ 2023-10-17 12:18:21,758 ----------------------------------------------------------------------------------------------------
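The final test numbers are internally consistent: the micro-average F1 is the harmonic mean of the micro precision and recall, and the macro F1 is the unweighted mean of the three per-class F1 scores. A quick arithmetic check against the table above:

```python
# Per-class f1-scores and the micro-avg precision/recall from the final evaluation.
class_f1 = {"LOC": 0.8490, "PER": 0.7438, "ORG": 0.4982}
micro_p, micro_r = 0.7452, 0.8119

micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)  # harmonic mean
macro_f1 = sum(class_f1.values()) / len(class_f1)       # unweighted mean

print(round(micro_f1, 4))  # 0.7771 -> matches "F-score (micro)"
print(round(macro_f1, 4))  # 0.697  -> matches "F-score (macro)"
```

The gap between macro (0.697) and micro (0.7771) F1 reflects class imbalance: ORG, the weakest class (0.4982), has only 127 of the 1005 test entities.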