stefan-it committed
Commit fefc258 · parent e69e52c

Upload folder using huggingface_hub

Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +237 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:454345c68d6db2b85ec82024b6f8ad4483653dd63fc197efc6d589f47e32e707
+ size 443311111
dev.tsv ADDED
The diff for this file is too large to render.
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 23:15:50 0.0000 0.2558 0.1092 0.5422 0.6465 0.5898 0.4239
+ 2 23:17:47 0.0000 0.0843 0.1188 0.5553 0.7815 0.6492 0.4886
+ 3 23:19:49 0.0000 0.0595 0.1852 0.5122 0.7437 0.6066 0.4443
+ 4 23:21:48 0.0000 0.0433 0.2180 0.5433 0.7254 0.6213 0.4591
+ 5 23:23:44 0.0000 0.0324 0.2920 0.5420 0.7757 0.6381 0.4785
+ 6 23:25:41 0.0000 0.0232 0.3259 0.5529 0.7300 0.6292 0.4660
+ 7 23:27:38 0.0000 0.0151 0.3449 0.5478 0.7677 0.6394 0.4789
+ 8 23:29:38 0.0000 0.0105 0.3785 0.5447 0.7872 0.6439 0.4831
+ 9 23:31:36 0.0000 0.0070 0.3912 0.5568 0.7632 0.6438 0.4830
+ 10 23:33:34 0.0000 0.0042 0.4097 0.5543 0.7654 0.6430 0.4823
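`loss.tsv` records one row per epoch with the train loss and dev metrics. Dev F1 peaks at epoch 2 (0.6492), which matches the training log: "saving best model" appears only after epochs 1 and 2. A minimal sketch of picking the best epoch from such a file (the first three rows are inlined here; the real file is tab-separated, but whitespace splitting handles both forms):

```python
# Find the epoch with the highest DEV_F1 in a Flair-style loss.tsv.
# Sample rows copied from the loss.tsv diff above.
sample = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 23:15:50 0.0000 0.2558 0.1092 0.5422 0.6465 0.5898 0.4239
2 23:17:47 0.0000 0.0843 0.1188 0.5553 0.7815 0.6492 0.4886
3 23:19:49 0.0000 0.0595 0.1852 0.5122 0.7437 0.6066 0.4443
"""

header, *lines = sample.strip().splitlines()
cols = header.split()
rows = [dict(zip(cols, line.split())) for line in lines]

# Model selection here assumes DEV_F1 is the metric, matching the log's
# "('micro avg', 'f1-score')" evaluation setting.
best = max(rows, key=lambda r: float(r["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # -> 2 0.6492
```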
test.tsv ADDED
The diff for this file is too large to render.
training.log ADDED
@@ -0,0 +1,237 @@
+ 2023-10-14 23:13:54,062 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:13:54,063 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:13:54,063 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+  - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+ 2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:13:54,063 Train: 14465 sentences
+ 2023-10-14 23:13:54,063 (train_with_dev=False, train_with_test=False)
+ 2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:13:54,063 Training Params:
+ 2023-10-14 23:13:54,063  - learning_rate: "5e-05"
+ 2023-10-14 23:13:54,063  - mini_batch_size: "8"
+ 2023-10-14 23:13:54,063  - max_epochs: "10"
+ 2023-10-14 23:13:54,063  - shuffle: "True"
+ 2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:13:54,063 Plugins:
+ 2023-10-14 23:13:54,063  - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:13:54,063 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-14 23:13:54,063  - metric: "('micro avg', 'f1-score')"
+ 2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:13:54,063 Computation:
+ 2023-10-14 23:13:54,063  - compute on device: cuda:0
+ 2023-10-14 23:13:54,064  - embedding storage: none
+ 2023-10-14 23:13:54,064 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:13:54,064 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
+ 2023-10-14 23:13:54,064 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:13:54,064 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:14:05,117 epoch 1 - iter 180/1809 - loss 1.38557352 - time (sec): 11.05 - samples/sec: 3462.40 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 23:14:16,159 epoch 1 - iter 360/1809 - loss 0.80804825 - time (sec): 22.09 - samples/sec: 3436.57 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 23:14:27,057 epoch 1 - iter 540/1809 - loss 0.59504882 - time (sec): 32.99 - samples/sec: 3419.21 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 23:14:38,311 epoch 1 - iter 720/1809 - loss 0.47345095 - time (sec): 44.25 - samples/sec: 3442.23 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 23:14:49,306 epoch 1 - iter 900/1809 - loss 0.40291276 - time (sec): 55.24 - samples/sec: 3439.24 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 23:15:00,620 epoch 1 - iter 1080/1809 - loss 0.35417304 - time (sec): 66.56 - samples/sec: 3444.05 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 23:15:11,503 epoch 1 - iter 1260/1809 - loss 0.32111486 - time (sec): 77.44 - samples/sec: 3438.10 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-14 23:15:22,452 epoch 1 - iter 1440/1809 - loss 0.29407849 - time (sec): 88.39 - samples/sec: 3437.09 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-14 23:15:33,541 epoch 1 - iter 1620/1809 - loss 0.27249387 - time (sec): 99.48 - samples/sec: 3427.40 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-14 23:15:44,480 epoch 1 - iter 1800/1809 - loss 0.25650520 - time (sec): 110.42 - samples/sec: 3426.34 - lr: 0.000050 - momentum: 0.000000
+ 2023-10-14 23:15:44,991 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:15:44,991 EPOCH 1 done: loss 0.2558 - lr: 0.000050
+ 2023-10-14 23:15:50,308 DEV : loss 0.10922493785619736 - f1-score (micro avg) 0.5898
+ 2023-10-14 23:15:50,338 saving best model
+ 2023-10-14 23:15:50,724 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:16:01,639 epoch 2 - iter 180/1809 - loss 0.09115416 - time (sec): 10.91 - samples/sec: 3394.64 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-14 23:16:12,680 epoch 2 - iter 360/1809 - loss 0.09214244 - time (sec): 21.95 - samples/sec: 3428.42 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-14 23:16:23,647 epoch 2 - iter 540/1809 - loss 0.08998102 - time (sec): 32.92 - samples/sec: 3449.64 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-14 23:16:34,929 epoch 2 - iter 720/1809 - loss 0.08845021 - time (sec): 44.20 - samples/sec: 3448.58 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-14 23:16:45,932 epoch 2 - iter 900/1809 - loss 0.08692869 - time (sec): 55.21 - samples/sec: 3466.90 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-14 23:16:57,081 epoch 2 - iter 1080/1809 - loss 0.08601996 - time (sec): 66.36 - samples/sec: 3452.03 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-14 23:17:08,054 epoch 2 - iter 1260/1809 - loss 0.08669678 - time (sec): 77.33 - samples/sec: 3443.25 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-14 23:17:18,723 epoch 2 - iter 1440/1809 - loss 0.08604451 - time (sec): 88.00 - samples/sec: 3433.05 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-14 23:17:29,971 epoch 2 - iter 1620/1809 - loss 0.08524993 - time (sec): 99.25 - samples/sec: 3434.57 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-14 23:17:40,783 epoch 2 - iter 1800/1809 - loss 0.08442574 - time (sec): 110.06 - samples/sec: 3435.34 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-14 23:17:41,312 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:17:41,313 EPOCH 2 done: loss 0.0843 - lr: 0.000044
+ 2023-10-14 23:17:47,616 DEV : loss 0.11877861618995667 - f1-score (micro avg) 0.6492
+ 2023-10-14 23:17:47,664 saving best model
+ 2023-10-14 23:17:48,158 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:17:59,754 epoch 3 - iter 180/1809 - loss 0.05224098 - time (sec): 11.59 - samples/sec: 3334.61 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-14 23:18:10,717 epoch 3 - iter 360/1809 - loss 0.05757660 - time (sec): 22.56 - samples/sec: 3391.79 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-14 23:18:21,815 epoch 3 - iter 540/1809 - loss 0.05923171 - time (sec): 33.65 - samples/sec: 3374.86 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-14 23:18:32,890 epoch 3 - iter 720/1809 - loss 0.05951510 - time (sec): 44.73 - samples/sec: 3404.49 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-14 23:18:44,341 epoch 3 - iter 900/1809 - loss 0.05917685 - time (sec): 56.18 - samples/sec: 3376.67 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-14 23:18:55,811 epoch 3 - iter 1080/1809 - loss 0.05994787 - time (sec): 67.65 - samples/sec: 3358.55 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-14 23:19:07,534 epoch 3 - iter 1260/1809 - loss 0.05928422 - time (sec): 79.37 - samples/sec: 3339.69 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-14 23:19:19,370 epoch 3 - iter 1440/1809 - loss 0.05933673 - time (sec): 91.21 - samples/sec: 3312.36 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-14 23:19:30,948 epoch 3 - iter 1620/1809 - loss 0.06123094 - time (sec): 102.79 - samples/sec: 3312.93 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-14 23:19:41,857 epoch 3 - iter 1800/1809 - loss 0.05963446 - time (sec): 113.70 - samples/sec: 3325.96 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-14 23:19:42,476 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:19:42,476 EPOCH 3 done: loss 0.0595 - lr: 0.000039
+ 2023-10-14 23:19:49,864 DEV : loss 0.1852269172668457 - f1-score (micro avg) 0.6066
+ 2023-10-14 23:19:49,907 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:20:01,922 epoch 4 - iter 180/1809 - loss 0.03466777 - time (sec): 12.01 - samples/sec: 3240.89 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-14 23:20:12,957 epoch 4 - iter 360/1809 - loss 0.03740752 - time (sec): 23.05 - samples/sec: 3315.19 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-14 23:20:24,052 epoch 4 - iter 540/1809 - loss 0.04243438 - time (sec): 34.14 - samples/sec: 3350.72 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-14 23:20:35,171 epoch 4 - iter 720/1809 - loss 0.04361553 - time (sec): 45.26 - samples/sec: 3339.38 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-14 23:20:46,045 epoch 4 - iter 900/1809 - loss 0.04259376 - time (sec): 56.14 - samples/sec: 3357.05 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-14 23:20:57,326 epoch 4 - iter 1080/1809 - loss 0.04216192 - time (sec): 67.42 - samples/sec: 3361.99 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-14 23:21:08,492 epoch 4 - iter 1260/1809 - loss 0.04211285 - time (sec): 78.58 - samples/sec: 3368.00 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-14 23:21:19,539 epoch 4 - iter 1440/1809 - loss 0.04181257 - time (sec): 89.63 - samples/sec: 3382.30 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-14 23:21:30,326 epoch 4 - iter 1620/1809 - loss 0.04312238 - time (sec): 100.42 - samples/sec: 3393.06 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-14 23:21:41,155 epoch 4 - iter 1800/1809 - loss 0.04326423 - time (sec): 111.25 - samples/sec: 3398.47 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-14 23:21:41,677 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:21:41,677 EPOCH 4 done: loss 0.0433 - lr: 0.000033
+ 2023-10-14 23:21:48,355 DEV : loss 0.21803739666938782 - f1-score (micro avg) 0.6213
+ 2023-10-14 23:21:48,388 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:21:59,544 epoch 5 - iter 180/1809 - loss 0.03382032 - time (sec): 11.16 - samples/sec: 3385.98 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-14 23:22:10,312 epoch 5 - iter 360/1809 - loss 0.02988533 - time (sec): 21.92 - samples/sec: 3446.91 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-14 23:22:21,368 epoch 5 - iter 540/1809 - loss 0.02897812 - time (sec): 32.98 - samples/sec: 3436.13 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-14 23:22:32,201 epoch 5 - iter 720/1809 - loss 0.02920811 - time (sec): 43.81 - samples/sec: 3422.99 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-14 23:22:43,261 epoch 5 - iter 900/1809 - loss 0.03020770 - time (sec): 54.87 - samples/sec: 3422.06 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-14 23:22:54,098 epoch 5 - iter 1080/1809 - loss 0.03043284 - time (sec): 65.71 - samples/sec: 3422.19 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 23:23:04,857 epoch 5 - iter 1260/1809 - loss 0.03101516 - time (sec): 76.47 - samples/sec: 3423.04 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 23:23:16,285 epoch 5 - iter 1440/1809 - loss 0.03128116 - time (sec): 87.90 - samples/sec: 3425.18 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 23:23:27,369 epoch 5 - iter 1620/1809 - loss 0.03158724 - time (sec): 98.98 - samples/sec: 3432.98 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 23:23:38,672 epoch 5 - iter 1800/1809 - loss 0.03222026 - time (sec): 110.28 - samples/sec: 3430.35 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 23:23:39,199 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:23:39,199 EPOCH 5 done: loss 0.0324 - lr: 0.000028
+ 2023-10-14 23:23:44,781 DEV : loss 0.29204583168029785 - f1-score (micro avg) 0.6381
+ 2023-10-14 23:23:44,816 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:23:55,602 epoch 6 - iter 180/1809 - loss 0.01835359 - time (sec): 10.78 - samples/sec: 3331.85 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 23:24:06,716 epoch 6 - iter 360/1809 - loss 0.02104928 - time (sec): 21.90 - samples/sec: 3383.12 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 23:24:17,593 epoch 6 - iter 540/1809 - loss 0.02382736 - time (sec): 32.78 - samples/sec: 3387.15 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 23:24:29,242 epoch 6 - iter 720/1809 - loss 0.02520287 - time (sec): 44.43 - samples/sec: 3348.00 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 23:24:40,477 epoch 6 - iter 900/1809 - loss 0.02464033 - time (sec): 55.66 - samples/sec: 3367.84 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 23:24:51,833 epoch 6 - iter 1080/1809 - loss 0.02490246 - time (sec): 67.02 - samples/sec: 3365.05 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 23:25:02,971 epoch 6 - iter 1260/1809 - loss 0.02386734 - time (sec): 78.15 - samples/sec: 3385.64 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 23:25:14,059 epoch 6 - iter 1440/1809 - loss 0.02384248 - time (sec): 89.24 - samples/sec: 3407.87 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 23:25:24,684 epoch 6 - iter 1620/1809 - loss 0.02370424 - time (sec): 99.87 - samples/sec: 3405.70 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 23:25:35,893 epoch 6 - iter 1800/1809 - loss 0.02323564 - time (sec): 111.08 - samples/sec: 3403.90 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 23:25:36,460 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:25:36,460 EPOCH 6 done: loss 0.0232 - lr: 0.000022
+ 2023-10-14 23:25:41,960 DEV : loss 0.325937956571579 - f1-score (micro avg) 0.6292
+ 2023-10-14 23:25:41,990 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:25:53,211 epoch 7 - iter 180/1809 - loss 0.01268428 - time (sec): 11.22 - samples/sec: 3309.54 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 23:26:04,242 epoch 7 - iter 360/1809 - loss 0.01253572 - time (sec): 22.25 - samples/sec: 3434.72 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 23:26:15,278 epoch 7 - iter 540/1809 - loss 0.01170739 - time (sec): 33.29 - samples/sec: 3474.17 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 23:26:26,064 epoch 7 - iter 720/1809 - loss 0.01321558 - time (sec): 44.07 - samples/sec: 3453.48 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 23:26:37,214 epoch 7 - iter 900/1809 - loss 0.01522502 - time (sec): 55.22 - samples/sec: 3442.44 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 23:26:48,017 epoch 7 - iter 1080/1809 - loss 0.01503871 - time (sec): 66.03 - samples/sec: 3451.90 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 23:26:58,958 epoch 7 - iter 1260/1809 - loss 0.01509298 - time (sec): 76.97 - samples/sec: 3464.90 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 23:27:09,510 epoch 7 - iter 1440/1809 - loss 0.01494629 - time (sec): 87.52 - samples/sec: 3455.94 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 23:27:21,399 epoch 7 - iter 1620/1809 - loss 0.01502665 - time (sec): 99.41 - samples/sec: 3423.29 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 23:27:32,509 epoch 7 - iter 1800/1809 - loss 0.01515155 - time (sec): 110.52 - samples/sec: 3422.67 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 23:27:33,040 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:27:33,041 EPOCH 7 done: loss 0.0151 - lr: 0.000017
+ 2023-10-14 23:27:38,689 DEV : loss 0.34486714005470276 - f1-score (micro avg) 0.6394
+ 2023-10-14 23:27:38,721 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:27:49,645 epoch 8 - iter 180/1809 - loss 0.00504407 - time (sec): 10.92 - samples/sec: 3368.16 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 23:28:00,921 epoch 8 - iter 360/1809 - loss 0.00779998 - time (sec): 22.20 - samples/sec: 3393.34 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 23:28:12,199 epoch 8 - iter 540/1809 - loss 0.00806594 - time (sec): 33.48 - samples/sec: 3389.11 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 23:28:23,292 epoch 8 - iter 720/1809 - loss 0.00804521 - time (sec): 44.57 - samples/sec: 3387.81 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 23:28:34,228 epoch 8 - iter 900/1809 - loss 0.00787755 - time (sec): 55.51 - samples/sec: 3387.75 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 23:28:45,339 epoch 8 - iter 1080/1809 - loss 0.00837508 - time (sec): 66.62 - samples/sec: 3398.85 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 23:28:56,214 epoch 8 - iter 1260/1809 - loss 0.00920763 - time (sec): 77.49 - samples/sec: 3418.01 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 23:29:07,281 epoch 8 - iter 1440/1809 - loss 0.01023911 - time (sec): 88.56 - samples/sec: 3413.01 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 23:29:18,361 epoch 8 - iter 1620/1809 - loss 0.01031827 - time (sec): 99.64 - samples/sec: 3406.77 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 23:29:29,759 epoch 8 - iter 1800/1809 - loss 0.01052622 - time (sec): 111.04 - samples/sec: 3408.30 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 23:29:30,243 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:29:30,243 EPOCH 8 done: loss 0.0105 - lr: 0.000011
+ 2023-10-14 23:29:38,316 DEV : loss 0.37847810983657837 - f1-score (micro avg) 0.6439
+ 2023-10-14 23:29:38,350 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:29:49,652 epoch 9 - iter 180/1809 - loss 0.00514140 - time (sec): 11.30 - samples/sec: 3379.90 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 23:30:00,849 epoch 9 - iter 360/1809 - loss 0.00528666 - time (sec): 22.50 - samples/sec: 3363.85 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 23:30:11,917 epoch 9 - iter 540/1809 - loss 0.00706337 - time (sec): 33.57 - samples/sec: 3376.04 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 23:30:23,087 epoch 9 - iter 720/1809 - loss 0.00660810 - time (sec): 44.74 - samples/sec: 3397.13 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 23:30:34,171 epoch 9 - iter 900/1809 - loss 0.00744791 - time (sec): 55.82 - samples/sec: 3402.85 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 23:30:45,401 epoch 9 - iter 1080/1809 - loss 0.00728486 - time (sec): 67.05 - samples/sec: 3414.58 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 23:30:56,449 epoch 9 - iter 1260/1809 - loss 0.00750422 - time (sec): 78.10 - samples/sec: 3417.96 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 23:31:07,262 epoch 9 - iter 1440/1809 - loss 0.00739246 - time (sec): 88.91 - samples/sec: 3414.94 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 23:31:18,285 epoch 9 - iter 1620/1809 - loss 0.00714565 - time (sec): 99.93 - samples/sec: 3420.75 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 23:31:28,970 epoch 9 - iter 1800/1809 - loss 0.00700843 - time (sec): 110.62 - samples/sec: 3418.39 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 23:31:29,513 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:31:29,513 EPOCH 9 done: loss 0.0070 - lr: 0.000006
+ 2023-10-14 23:31:35,977 DEV : loss 0.39124542474746704 - f1-score (micro avg) 0.6438
+ 2023-10-14 23:31:36,014 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:31:47,439 epoch 10 - iter 180/1809 - loss 0.00408599 - time (sec): 11.42 - samples/sec: 3336.10 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 23:31:58,371 epoch 10 - iter 360/1809 - loss 0.00331461 - time (sec): 22.36 - samples/sec: 3400.13 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 23:32:09,317 epoch 10 - iter 540/1809 - loss 0.00293145 - time (sec): 33.30 - samples/sec: 3393.56 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 23:32:20,482 epoch 10 - iter 720/1809 - loss 0.00380863 - time (sec): 44.47 - samples/sec: 3405.89 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 23:32:31,512 epoch 10 - iter 900/1809 - loss 0.00352863 - time (sec): 55.50 - samples/sec: 3420.09 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 23:32:42,367 epoch 10 - iter 1080/1809 - loss 0.00358662 - time (sec): 66.35 - samples/sec: 3425.15 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 23:32:53,206 epoch 10 - iter 1260/1809 - loss 0.00382641 - time (sec): 77.19 - samples/sec: 3431.10 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 23:33:04,379 epoch 10 - iter 1440/1809 - loss 0.00369667 - time (sec): 88.36 - samples/sec: 3435.24 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 23:33:15,225 epoch 10 - iter 1620/1809 - loss 0.00426803 - time (sec): 99.21 - samples/sec: 3416.99 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 23:33:26,612 epoch 10 - iter 1800/1809 - loss 0.00418167 - time (sec): 110.60 - samples/sec: 3421.02 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-14 23:33:27,097 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:33:27,097 EPOCH 10 done: loss 0.0042 - lr: 0.000000
+ 2023-10-14 23:33:34,503 DEV : loss 0.40973159670829773 - f1-score (micro avg) 0.643
+ 2023-10-14 23:33:34,924 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 23:33:34,926 Loading model from best epoch ...
+ 2023-10-14 23:33:36,531 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+ 2023-10-14 23:33:44,358
+ Results:
+ - F-score (micro) 0.6361
+ - F-score (macro) 0.4367
+ - Accuracy 0.4774
+
+ By class:
+               precision    recall  f1-score   support
+
+          loc     0.6206    0.7750    0.6892       591
+         pers     0.5596    0.6975    0.6209       357
+          org     0.0000    0.0000    0.0000        79
+
+    micro avg     0.5911    0.6884    0.6361      1027
+    macro avg     0.3934    0.4908    0.4367      1027
+ weighted avg     0.5516    0.6884    0.6125      1027
+
+ 2023-10-14 23:33:44,359 ----------------------------------------------------------------------------------------------------
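As a quick sanity check, the aggregate rows of the final test report follow from the per-class numbers printed above: macro-F1 is the unweighted mean of the three class F1 scores (loc, pers, org), and micro-F1 is the harmonic mean of the micro-averaged precision and recall. A small sketch, using only values from the report:

```python
# Recompute the report's aggregate F1 rows from its per-class and micro rows.
per_class_f1 = {"loc": 0.6892, "pers": 0.6209, "org": 0.0}

# Macro-F1: unweighted mean over classes.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Micro-F1: harmonic mean of micro-averaged precision and recall.
micro_p, micro_r = 0.5911, 0.6884
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4), round(micro_f1, 4))  # -> 0.4367 0.6361
```

Both values match the "F-score (macro)" and "F-score (micro)" lines of the log; the org class contributes 0.0 to the macro average, which is why macro-F1 (0.4367) sits well below micro-F1 (0.6361).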