stefan-it committed
Commit fc5b876 · 1 Parent(s): f3f2759

Upload folder using huggingface_hub

Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +239 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:67cba33582e3a7352b931d84a2094a0c1b9499222725ad6c916b98773221a75b
+ size 443311111
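The three lines added for best-model.pt are a Git LFS pointer, not the model weights themselves. As a minimal sketch (the helper name `parse_lfs_pointer` is hypothetical and not part of any library), the pointer can be read as space-separated key/value pairs:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file (illustrative helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:67cba33582e3a7352b931d84a2094a0c1b9499222725ad6c916b98773221a75b
size 443311111
"""

info = parse_lfs_pointer(POINTER)
size_mib = info["size_bytes"] / 2**20  # bytes to MiB
print(info["hash_algo"], f"{size_mib:.1f} MiB")
```

The `size` field is the byte count of the real file, so the checkpoint referenced here is roughly 423 MiB.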
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 00:35:06 0.0000 0.3439 0.1184 0.5774 0.6708 0.6206 0.4897
+ 2 00:36:41 0.0000 0.1127 0.1202 0.7098 0.7387 0.7239 0.5851
+ 3 00:38:17 0.0000 0.0817 0.1288 0.7525 0.7738 0.7630 0.6322
+ 4 00:39:52 0.0000 0.0614 0.1412 0.7128 0.7805 0.7451 0.6122
+ 5 00:41:26 0.0000 0.0461 0.1599 0.7374 0.7783 0.7573 0.6300
+ 6 00:43:00 0.0000 0.0345 0.1857 0.7374 0.7783 0.7573 0.6306
+ 7 00:44:32 0.0000 0.0238 0.1924 0.7373 0.7873 0.7615 0.6345
+ 8 00:46:07 0.0000 0.0171 0.2011 0.7571 0.7862 0.7714 0.6447
+ 9 00:47:41 0.0000 0.0128 0.2167 0.7576 0.7851 0.7711 0.6444
+ 10 00:49:15 0.0000 0.0078 0.2260 0.7452 0.7941 0.7689 0.6405
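To see which epoch the "best model" corresponds to, the rows of loss.tsv can be ranked by DEV_F1. The sketch below embeds the table as whitespace-separated text for self-containment (the real file is tab-separated, and reading it from disk is left out); the function name `read_rows` is a hypothetical helper.

```python
# Rows copied from loss.tsv above; whitespace-separated here for simplicity.
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 00:35:06 0.0000 0.3439 0.1184 0.5774 0.6708 0.6206 0.4897
2 00:36:41 0.0000 0.1127 0.1202 0.7098 0.7387 0.7239 0.5851
3 00:38:17 0.0000 0.0817 0.1288 0.7525 0.7738 0.7630 0.6322
4 00:39:52 0.0000 0.0614 0.1412 0.7128 0.7805 0.7451 0.6122
5 00:41:26 0.0000 0.0461 0.1599 0.7374 0.7783 0.7573 0.6300
6 00:43:00 0.0000 0.0345 0.1857 0.7374 0.7783 0.7573 0.6306
7 00:44:32 0.0000 0.0238 0.1924 0.7373 0.7873 0.7615 0.6345
8 00:46:07 0.0000 0.0171 0.2011 0.7571 0.7862 0.7714 0.6447
9 00:47:41 0.0000 0.0128 0.2167 0.7576 0.7851 0.7711 0.6444
10 00:49:15 0.0000 0.0078 0.2260 0.7452 0.7941 0.7689 0.6405
"""


def read_rows(text: str) -> list[dict]:
    """Turn the header line plus data lines into a list of dicts."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = read_rows(LOSS_TSV)

# Epoch with the highest dev F1 (the model-selection metric in the log).
best_f1_row = max(rows, key=lambda r: float(r["DEV_F1"]))
# Epoch with the lowest dev loss, for comparison.
lowest_loss_row = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print("best DEV_F1:", best_f1_row["EPOCH"], best_f1_row["DEV_F1"])
print("lowest DEV_LOSS:", lowest_loss_row["EPOCH"], lowest_loss_row["DEV_LOSS"])
```

In this run the two criteria disagree: dev loss is lowest after epoch 1 and rises every epoch afterwards, while dev F1 peaks at epoch 8. The LEARNING_RATE column shows 0.0000 only because values on the order of 1e-05 are written with four decimals.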
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,239 @@
+ 2023-10-14 00:33:33,347 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:33:33,348 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-14 00:33:33,348 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:33:33,349 MultiCorpus: 7936 train + 992 dev + 992 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
+ 2023-10-14 00:33:33,349 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:33:33,349 Train: 7936 sentences
+ 2023-10-14 00:33:33,349 (train_with_dev=False, train_with_test=False)
+ 2023-10-14 00:33:33,349 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:33:33,349 Training Params:
+ 2023-10-14 00:33:33,349 - learning_rate: "3e-05"
+ 2023-10-14 00:33:33,349 - mini_batch_size: "4"
+ 2023-10-14 00:33:33,349 - max_epochs: "10"
+ 2023-10-14 00:33:33,349 - shuffle: "True"
+ 2023-10-14 00:33:33,349 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:33:33,349 Plugins:
+ 2023-10-14 00:33:33,349 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-14 00:33:33,349 ----------------------------------------------------------------------------------------------------
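The parameters above (learning rate 3e-05, mini-batch size 4, 10 epochs, 7936 training sentences, warmup fraction 0.1) are enough to sketch the learning-rate curve. The function below is the common linear warmup followed by linear decay to zero; that Flair's LinearScheduler plugin computes exactly this is an assumption, and `linear_schedule_lr` is a hypothetical name. The values it produces agree with the `lr:` column of this log at the six-decimal precision printed there.

```python
def linear_schedule_lr(step: int, total_steps: int, peak_lr: float,
                       warmup_fraction: float) -> float:
    """Linear warmup to peak_lr, then linear decay to zero (assumed schedule shape)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale up proportionally to the step count.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale down proportionally to the remaining steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the "Training Params" and corpus lines of the log.
TRAIN_SENTENCES = 7936
MINI_BATCH_SIZE = 4
MAX_EPOCHS = 10
PEAK_LR = 3e-05
WARMUP_FRACTION = 0.1

STEPS_PER_EPOCH = TRAIN_SENTENCES // MINI_BATCH_SIZE  # 1984 iterations per epoch
TOTAL_STEPS = STEPS_PER_EPOCH * MAX_EPOCHS            # 19840 optimizer steps

for epoch, it in [(1, 198), (1, 1980), (2, 1980), (10, 1980)]:
    step = (epoch - 1) * STEPS_PER_EPOCH + it
    lr = linear_schedule_lr(step, TOTAL_STEPS, PEAK_LR, WARMUP_FRACTION)
    print(f"epoch {epoch} iter {it}: lr {lr:.6f}")
```

With a warmup fraction of 0.1 and 10 epochs, the warmup covers exactly the first epoch, which is why the logged rate climbs to 0.000030 at the end of epoch 1 and only decreases afterwards.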
+ 2023-10-14 00:33:33,349 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-14 00:33:33,349 - metric: "('micro avg', 'f1-score')"
+ 2023-10-14 00:33:33,349 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:33:33,349 Computation:
+ 2023-10-14 00:33:33,349 - compute on device: cuda:0
+ 2023-10-14 00:33:33,349 - embedding storage: none
+ 2023-10-14 00:33:33,349 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:33:33,349 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
+ 2023-10-14 00:33:33,349 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:33:33,349 ----------------------------------------------------------------------------------------------------
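The base path appears to encode the run configuration in its suffix: `bs4`, `e10`, and `lr3e-05` match the logged mini-batch size, epoch count, and learning rate, and `crfFalse` is consistent with the model printing a plain CrossEntropyLoss. The regular expression below is a guess at that naming scheme, not a documented format; the group names are illustrative, and the meaning of `ws` and of the trailing number (possibly a seed or run index) is not stated anywhere in the log.

```python
import re

# Path copied from the "Model training base path" log line.
BASE_PATH = (
    "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased"
    "-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
)

# Assumed layout of the suffix; every group name here is hypothetical.
PATTERN = re.compile(
    r"^(?P<prefix>.+)"
    r"-bs(?P<batch_size>\d+)"
    r"-ws(?P<ws>True|False)"
    r"-e(?P<epochs>\d+)"
    r"-lr(?P<lr>[0-9.]+e-?\d+)"
    r"-pooling(?P<pooling>[a-z]+)"
    r"-layers(?P<layers>-?\d+)"
    r"-crf(?P<crf>True|False)"
    r"-(?P<suffix>\d+)$"
)

match = PATTERN.match(BASE_PATH)
config = match.groupdict() if match else {}

for key, value in config.items():
    print(f"{key}: {value}")
```

Cross-checking the parsed fields against the "Training Params" block is a cheap way to confirm that a log file belongs to the directory it was found in.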
+ 2023-10-14 00:33:42,420 epoch 1 - iter 198/1984 - loss 1.84648731 - time (sec): 9.07 - samples/sec: 1702.75 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 00:33:51,452 epoch 1 - iter 396/1984 - loss 1.08861314 - time (sec): 18.10 - samples/sec: 1741.41 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 00:34:00,444 epoch 1 - iter 594/1984 - loss 0.79789327 - time (sec): 27.09 - samples/sec: 1775.55 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 00:34:09,335 epoch 1 - iter 792/1984 - loss 0.64340507 - time (sec): 35.98 - samples/sec: 1788.69 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 00:34:18,432 epoch 1 - iter 990/1984 - loss 0.55085242 - time (sec): 45.08 - samples/sec: 1792.80 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 00:34:27,429 epoch 1 - iter 1188/1984 - loss 0.48020788 - time (sec): 54.08 - samples/sec: 1809.66 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 00:34:36,495 epoch 1 - iter 1386/1984 - loss 0.43363784 - time (sec): 63.14 - samples/sec: 1801.24 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 00:34:45,534 epoch 1 - iter 1584/1984 - loss 0.39617398 - time (sec): 72.18 - samples/sec: 1801.87 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 00:34:54,461 epoch 1 - iter 1782/1984 - loss 0.36780931 - time (sec): 81.11 - samples/sec: 1803.46 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 00:35:03,459 epoch 1 - iter 1980/1984 - loss 0.34466223 - time (sec): 90.11 - samples/sec: 1813.44 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 00:35:03,679 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:35:03,679 EPOCH 1 done: loss 0.3439 - lr: 0.000030
+ 2023-10-14 00:35:06,904 DEV : loss 0.11840364336967468 - f1-score (micro avg) 0.6206
+ 2023-10-14 00:35:06,925 saving best model
+ 2023-10-14 00:35:07,303 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:35:16,413 epoch 2 - iter 198/1984 - loss 0.14075901 - time (sec): 9.11 - samples/sec: 1669.48 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 00:35:25,436 epoch 2 - iter 396/1984 - loss 0.12400082 - time (sec): 18.13 - samples/sec: 1720.40 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 00:35:34,763 epoch 2 - iter 594/1984 - loss 0.12219424 - time (sec): 27.46 - samples/sec: 1716.50 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 00:35:43,860 epoch 2 - iter 792/1984 - loss 0.11620520 - time (sec): 36.56 - samples/sec: 1741.22 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 00:35:52,817 epoch 2 - iter 990/1984 - loss 0.11433495 - time (sec): 45.51 - samples/sec: 1776.12 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 00:36:01,818 epoch 2 - iter 1188/1984 - loss 0.11548844 - time (sec): 54.51 - samples/sec: 1792.84 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 00:36:10,792 epoch 2 - iter 1386/1984 - loss 0.11442529 - time (sec): 63.49 - samples/sec: 1799.30 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 00:36:19,715 epoch 2 - iter 1584/1984 - loss 0.11261525 - time (sec): 72.41 - samples/sec: 1798.49 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 00:36:29,002 epoch 2 - iter 1782/1984 - loss 0.11148968 - time (sec): 81.70 - samples/sec: 1799.89 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 00:36:38,044 epoch 2 - iter 1980/1984 - loss 0.11272837 - time (sec): 90.74 - samples/sec: 1801.90 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 00:36:38,255 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:36:38,256 EPOCH 2 done: loss 0.1127 - lr: 0.000027
+ 2023-10-14 00:36:41,653 DEV : loss 0.12019851058721542 - f1-score (micro avg) 0.7239
+ 2023-10-14 00:36:41,675 saving best model
+ 2023-10-14 00:36:42,206 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:36:51,213 epoch 3 - iter 198/1984 - loss 0.06749262 - time (sec): 9.01 - samples/sec: 1676.79 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 00:37:00,237 epoch 3 - iter 396/1984 - loss 0.07556813 - time (sec): 18.03 - samples/sec: 1799.64 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 00:37:09,114 epoch 3 - iter 594/1984 - loss 0.07935076 - time (sec): 26.91 - samples/sec: 1779.72 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 00:37:18,151 epoch 3 - iter 792/1984 - loss 0.08102940 - time (sec): 35.94 - samples/sec: 1773.27 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 00:37:27,454 epoch 3 - iter 990/1984 - loss 0.07941764 - time (sec): 45.25 - samples/sec: 1800.13 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 00:37:36,997 epoch 3 - iter 1188/1984 - loss 0.08141394 - time (sec): 54.79 - samples/sec: 1785.56 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 00:37:46,204 epoch 3 - iter 1386/1984 - loss 0.08073553 - time (sec): 64.00 - samples/sec: 1790.81 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 00:37:55,490 epoch 3 - iter 1584/1984 - loss 0.08000645 - time (sec): 73.28 - samples/sec: 1792.85 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 00:38:04,457 epoch 3 - iter 1782/1984 - loss 0.08063126 - time (sec): 82.25 - samples/sec: 1786.95 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 00:38:13,357 epoch 3 - iter 1980/1984 - loss 0.08177117 - time (sec): 91.15 - samples/sec: 1794.98 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 00:38:13,537 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:38:13,537 EPOCH 3 done: loss 0.0817 - lr: 0.000023
+ 2023-10-14 00:38:17,007 DEV : loss 0.1288405805826187 - f1-score (micro avg) 0.763
+ 2023-10-14 00:38:17,032 saving best model
+ 2023-10-14 00:38:17,503 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:38:26,719 epoch 4 - iter 198/1984 - loss 0.05060713 - time (sec): 9.21 - samples/sec: 1901.80 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 00:38:35,936 epoch 4 - iter 396/1984 - loss 0.05243662 - time (sec): 18.43 - samples/sec: 1833.07 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 00:38:44,927 epoch 4 - iter 594/1984 - loss 0.05486600 - time (sec): 27.42 - samples/sec: 1826.93 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 00:38:54,019 epoch 4 - iter 792/1984 - loss 0.05512558 - time (sec): 36.51 - samples/sec: 1809.34 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 00:39:03,272 epoch 4 - iter 990/1984 - loss 0.05454064 - time (sec): 45.76 - samples/sec: 1807.82 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 00:39:12,825 epoch 4 - iter 1188/1984 - loss 0.05627708 - time (sec): 55.32 - samples/sec: 1785.28 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 00:39:21,891 epoch 4 - iter 1386/1984 - loss 0.05694993 - time (sec): 64.38 - samples/sec: 1775.39 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 00:39:30,648 epoch 4 - iter 1584/1984 - loss 0.05757246 - time (sec): 73.14 - samples/sec: 1784.06 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 00:39:39,551 epoch 4 - iter 1782/1984 - loss 0.05843888 - time (sec): 82.04 - samples/sec: 1781.71 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 00:39:48,827 epoch 4 - iter 1980/1984 - loss 0.06142881 - time (sec): 91.32 - samples/sec: 1792.12 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 00:39:49,033 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:39:49,033 EPOCH 4 done: loss 0.0614 - lr: 0.000020
+ 2023-10-14 00:39:52,419 DEV : loss 0.14116324484348297 - f1-score (micro avg) 0.7451
+ 2023-10-14 00:39:52,439 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:40:01,484 epoch 5 - iter 198/1984 - loss 0.04525493 - time (sec): 9.04 - samples/sec: 1830.03 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 00:40:10,539 epoch 5 - iter 396/1984 - loss 0.04542360 - time (sec): 18.10 - samples/sec: 1842.65 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 00:40:19,522 epoch 5 - iter 594/1984 - loss 0.04551483 - time (sec): 27.08 - samples/sec: 1820.33 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 00:40:28,471 epoch 5 - iter 792/1984 - loss 0.04325209 - time (sec): 36.03 - samples/sec: 1828.91 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 00:40:37,434 epoch 5 - iter 990/1984 - loss 0.04250521 - time (sec): 44.99 - samples/sec: 1829.61 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 00:40:46,472 epoch 5 - iter 1188/1984 - loss 0.04318675 - time (sec): 54.03 - samples/sec: 1830.16 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 00:40:55,443 epoch 5 - iter 1386/1984 - loss 0.04436125 - time (sec): 63.00 - samples/sec: 1812.70 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 00:41:04,539 epoch 5 - iter 1584/1984 - loss 0.04521410 - time (sec): 72.10 - samples/sec: 1820.00 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 00:41:13,633 epoch 5 - iter 1782/1984 - loss 0.04534807 - time (sec): 81.19 - samples/sec: 1813.67 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 00:41:22,673 epoch 5 - iter 1980/1984 - loss 0.04610443 - time (sec): 90.23 - samples/sec: 1814.23 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 00:41:22,851 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:41:22,851 EPOCH 5 done: loss 0.0461 - lr: 0.000017
+ 2023-10-14 00:41:26,783 DEV : loss 0.15994729101657867 - f1-score (micro avg) 0.7573
+ 2023-10-14 00:41:26,804 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:41:36,094 epoch 6 - iter 198/1984 - loss 0.04012582 - time (sec): 9.29 - samples/sec: 1865.57 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 00:41:45,056 epoch 6 - iter 396/1984 - loss 0.03880028 - time (sec): 18.25 - samples/sec: 1830.40 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 00:41:54,099 epoch 6 - iter 594/1984 - loss 0.03591012 - time (sec): 27.29 - samples/sec: 1803.29 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 00:42:03,182 epoch 6 - iter 792/1984 - loss 0.03583443 - time (sec): 36.38 - samples/sec: 1810.54 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 00:42:12,153 epoch 6 - iter 990/1984 - loss 0.03618946 - time (sec): 45.35 - samples/sec: 1818.44 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 00:42:21,087 epoch 6 - iter 1188/1984 - loss 0.03540426 - time (sec): 54.28 - samples/sec: 1812.97 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 00:42:30,099 epoch 6 - iter 1386/1984 - loss 0.03591058 - time (sec): 63.29 - samples/sec: 1812.29 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 00:42:39,105 epoch 6 - iter 1584/1984 - loss 0.03527012 - time (sec): 72.30 - samples/sec: 1813.34 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 00:42:48,273 epoch 6 - iter 1782/1984 - loss 0.03466812 - time (sec): 81.47 - samples/sec: 1817.65 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 00:42:57,276 epoch 6 - iter 1980/1984 - loss 0.03459241 - time (sec): 90.47 - samples/sec: 1809.43 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 00:42:57,461 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:42:57,461 EPOCH 6 done: loss 0.0345 - lr: 0.000013
+ 2023-10-14 00:43:00,862 DEV : loss 0.1856544464826584 - f1-score (micro avg) 0.7573
+ 2023-10-14 00:43:00,883 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:43:09,962 epoch 7 - iter 198/1984 - loss 0.02135452 - time (sec): 9.08 - samples/sec: 1781.01 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 00:43:18,579 epoch 7 - iter 396/1984 - loss 0.02314302 - time (sec): 17.70 - samples/sec: 1814.63 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 00:43:27,249 epoch 7 - iter 594/1984 - loss 0.02014483 - time (sec): 26.36 - samples/sec: 1861.63 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 00:43:35,932 epoch 7 - iter 792/1984 - loss 0.02217636 - time (sec): 35.05 - samples/sec: 1872.99 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 00:43:44,530 epoch 7 - iter 990/1984 - loss 0.02248853 - time (sec): 43.65 - samples/sec: 1870.13 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 00:43:53,208 epoch 7 - iter 1188/1984 - loss 0.02281695 - time (sec): 52.32 - samples/sec: 1878.85 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 00:44:01,962 epoch 7 - iter 1386/1984 - loss 0.02260346 - time (sec): 61.08 - samples/sec: 1883.39 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 00:44:10,588 epoch 7 - iter 1584/1984 - loss 0.02346248 - time (sec): 69.70 - samples/sec: 1881.17 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 00:44:19,772 epoch 7 - iter 1782/1984 - loss 0.02378911 - time (sec): 78.89 - samples/sec: 1869.24 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 00:44:28,729 epoch 7 - iter 1980/1984 - loss 0.02373852 - time (sec): 87.84 - samples/sec: 1862.99 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 00:44:28,901 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:44:28,901 EPOCH 7 done: loss 0.0238 - lr: 0.000010
+ 2023-10-14 00:44:32,838 DEV : loss 0.19240804016590118 - f1-score (micro avg) 0.7615
+ 2023-10-14 00:44:32,859 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:44:41,845 epoch 8 - iter 198/1984 - loss 0.01575059 - time (sec): 8.98 - samples/sec: 1912.11 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 00:44:50,888 epoch 8 - iter 396/1984 - loss 0.01422090 - time (sec): 18.03 - samples/sec: 1844.50 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 00:44:59,843 epoch 8 - iter 594/1984 - loss 0.01545793 - time (sec): 26.98 - samples/sec: 1809.45 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 00:45:09,054 epoch 8 - iter 792/1984 - loss 0.01632885 - time (sec): 36.19 - samples/sec: 1826.74 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 00:45:18,131 epoch 8 - iter 990/1984 - loss 0.01657672 - time (sec): 45.27 - samples/sec: 1833.70 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 00:45:27,193 epoch 8 - iter 1188/1984 - loss 0.01688354 - time (sec): 54.33 - samples/sec: 1838.93 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 00:45:36,076 epoch 8 - iter 1386/1984 - loss 0.01653230 - time (sec): 63.22 - samples/sec: 1837.14 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 00:45:45,263 epoch 8 - iter 1584/1984 - loss 0.01631411 - time (sec): 72.40 - samples/sec: 1821.74 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 00:45:54,512 epoch 8 - iter 1782/1984 - loss 0.01666752 - time (sec): 81.65 - samples/sec: 1811.34 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 00:46:03,561 epoch 8 - iter 1980/1984 - loss 0.01708603 - time (sec): 90.70 - samples/sec: 1805.56 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 00:46:03,739 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:46:03,739 EPOCH 8 done: loss 0.0171 - lr: 0.000007
+ 2023-10-14 00:46:07,144 DEV : loss 0.20113840699195862 - f1-score (micro avg) 0.7714
+ 2023-10-14 00:46:07,165 saving best model
+ 2023-10-14 00:46:07,687 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:46:16,719 epoch 9 - iter 198/1984 - loss 0.01121190 - time (sec): 9.03 - samples/sec: 1791.90 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 00:46:25,702 epoch 9 - iter 396/1984 - loss 0.01335710 - time (sec): 18.01 - samples/sec: 1844.29 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 00:46:34,718 epoch 9 - iter 594/1984 - loss 0.01317916 - time (sec): 27.03 - samples/sec: 1851.45 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 00:46:43,597 epoch 9 - iter 792/1984 - loss 0.01271864 - time (sec): 35.91 - samples/sec: 1828.80 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 00:46:52,551 epoch 9 - iter 990/1984 - loss 0.01201250 - time (sec): 44.86 - samples/sec: 1834.11 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 00:47:01,618 epoch 9 - iter 1188/1984 - loss 0.01186694 - time (sec): 53.93 - samples/sec: 1830.03 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 00:47:10,838 epoch 9 - iter 1386/1984 - loss 0.01295976 - time (sec): 63.15 - samples/sec: 1817.33 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 00:47:19,883 epoch 9 - iter 1584/1984 - loss 0.01303216 - time (sec): 72.19 - samples/sec: 1825.07 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 00:47:28,749 epoch 9 - iter 1782/1984 - loss 0.01278500 - time (sec): 81.06 - samples/sec: 1821.39 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 00:47:37,683 epoch 9 - iter 1980/1984 - loss 0.01286673 - time (sec): 89.99 - samples/sec: 1817.30 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 00:47:37,879 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:47:37,879 EPOCH 9 done: loss 0.0128 - lr: 0.000003
+ 2023-10-14 00:47:41,336 DEV : loss 0.2166709452867508 - f1-score (micro avg) 0.7711
+ 2023-10-14 00:47:41,357 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:47:50,410 epoch 10 - iter 198/1984 - loss 0.00699997 - time (sec): 9.05 - samples/sec: 1941.31 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 00:47:59,477 epoch 10 - iter 396/1984 - loss 0.00743539 - time (sec): 18.12 - samples/sec: 1881.47 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 00:48:08,429 epoch 10 - iter 594/1984 - loss 0.00701043 - time (sec): 27.07 - samples/sec: 1816.63 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 00:48:17,379 epoch 10 - iter 792/1984 - loss 0.00691793 - time (sec): 36.02 - samples/sec: 1826.01 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 00:48:26,370 epoch 10 - iter 990/1984 - loss 0.00733336 - time (sec): 45.01 - samples/sec: 1829.49 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 00:48:35,376 epoch 10 - iter 1188/1984 - loss 0.00787838 - time (sec): 54.02 - samples/sec: 1821.42 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 00:48:44,332 epoch 10 - iter 1386/1984 - loss 0.00801026 - time (sec): 62.97 - samples/sec: 1820.26 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 00:48:53,392 epoch 10 - iter 1584/1984 - loss 0.00795308 - time (sec): 72.03 - samples/sec: 1822.69 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 00:49:02,320 epoch 10 - iter 1782/1984 - loss 0.00774112 - time (sec): 80.96 - samples/sec: 1826.42 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-14 00:49:11,494 epoch 10 - iter 1980/1984 - loss 0.00777068 - time (sec): 90.14 - samples/sec: 1816.02 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-14 00:49:11,673 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:49:11,673 EPOCH 10 done: loss 0.0078 - lr: 0.000000
+ 2023-10-14 00:49:15,510 DEV : loss 0.22600804269313812 - f1-score (micro avg) 0.7689
+ 2023-10-14 00:49:15,955 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 00:49:15,956 Loading model from best epoch ...
+ 2023-10-14 00:49:17,330 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
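The 13 tags are a BIOES encoding of three entity types: one `O` tag plus `S-`, `B-`, `E-`, and `I-` variants for PER, LOC, and ORG, which also explains the `out_features=13` of the final linear layer. The sketch below rebuilds that tag list and decodes a tag sequence into spans. The decoder is a deliberately strict illustration (malformed sequences are dropped), not Flair's own decoding logic, and the example sentence is invented.

```python
LABELS = ["PER", "LOC", "ORG"]
# One "O" tag plus four positional prefixes per label: 1 + 4 * 3 = 13 tags.
TAGS = ["O"] + [f"{prefix}-{label}" for label in LABELS for prefix in "SBEI"]


def bioes_to_spans(tags: list[str]) -> list[tuple[str, int, int]]:
    """Decode BIOES tags into (label, start, end) spans with an exclusive end."""
    spans = []
    start, label = None, None  # state of the currently open multi-token span
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, _, tag_label = tag.partition("-")
        if prefix == "S":
            # Single-token entity.
            spans.append((tag_label, i, i + 1))
            start, label = None, None
        elif prefix == "B":
            # Open a new multi-token entity.
            start, label = i, tag_label
        elif prefix == "I":
            # Continue only if the label matches the open span.
            if label != tag_label:
                start, label = None, None
        elif prefix == "E":
            # Close the open span if it is well formed.
            if start is not None and label == tag_label:
                spans.append((tag_label, start, i + 1))
            start, label = None, None
    return spans


# Invented example sentence, for illustration only.
tokens = ["M.", "Jules", "Ferry", "arrive", "à", "Paris"]
tags = ["O", "B-PER", "E-PER", "O", "O", "S-LOC"]

for label, start, end in bioes_to_spans(tags):
    print(label, " ".join(tokens[start:end]))
```

Span-level decoding matters for the scores that follow: the precision, recall, and F1 values in the final report count whole entities, not individual tags.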
+ 2023-10-14 00:49:20,598
+ Results:
+ - F-score (micro) 0.7769
+ - F-score (macro) 0.6893
+ - Accuracy 0.6567
+
+ By class:
+               precision    recall  f1-score   support
+
+          LOC     0.8162    0.8473    0.8315       655
+          PER     0.7083    0.8386    0.7680       223
+          ORG     0.5474    0.4094    0.4685       127
+
+    micro avg     0.7642    0.7900    0.7769      1005
+    macro avg     0.6906    0.6984    0.6893      1005
+ weighted avg     0.7583    0.7900    0.7715      1005
+
+ 2023-10-14 00:49:20,598 ----------------------------------------------------------------------------------------------------
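The three averaged rows of the report can be re-derived from the per-class rows, which helps explain why the macro F1 (0.6893) sits well below the micro F1 (0.7769): ORG has few test entities and a low score, and the macro average weights every class equally. The sketch below recomputes the averages from the rounded values printed in the log, so agreement is only expected to four decimals.

```python
# Per-class rows copied from the final report: (precision, recall, f1, support).
REPORT = {
    "LOC": (0.8162, 0.8473, 0.8315, 655),
    "PER": (0.7083, 0.8386, 0.7680, 223),
    "ORG": (0.5474, 0.4094, 0.4685, 127),
}

total_support = sum(row[3] for row in REPORT.values())

# Macro average: unweighted mean over classes.
macro_f1 = sum(row[2] for row in REPORT.values()) / len(REPORT)

# Weighted average: mean over classes weighted by support.
weighted_f1 = sum(row[2] * row[3] for row in REPORT.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall from the log.
micro_precision, micro_recall = 0.7642, 0.7900
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support     {total_support}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
```

The micro F1 is the figure used for model selection here, since the log names `('micro avg', 'f1-score')` as the evaluation metric.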