2023-10-17 16:18:17,473 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,475 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:18:17,475 ----------------------------------------------------------------------------------------------------
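The module dump above is a Flair SequenceTagger: ELECTRA-based TransformerWordEmbeddings feeding a LockedDropout layer and a single Linear(768, 21) classification head, with no RNN and no CRF (plain softmax + CrossEntropyLoss). A minimal sketch of the embedding setup is given below; the exact Hugging Face model id is an assumption inferred from the base path reported further down, and the pooling/layer arguments are read off the run name (poolingfirst, layers-1), not from this log.

```python
from flair.embeddings import TransformerWordEmbeddings

# Backbone id is an assumption, inferred from the base path
# "de-hmteams/teams-base-historic-multilingual-discriminator-...".
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",               # "layers-1" in the run name: use only the last transformer layer
    subtoken_pooling="first",  # "poolingfirst": represent each word by its first subtoken
    fine_tune=True,            # backbone weights are updated during fine-tuning
)
```

The LockedDropout(p=0.5) and Linear(768, 21) modules in the dump belong to the SequenceTagger itself, which adds them on top of the embeddings when built without an RNN or CRF.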
2023-10-17 16:18:17,475 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 16:18:17,475 ----------------------------------------------------------------------------------------------------
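The corpus is the German HIPE-2020 configuration of Flair's NER_HIPE_2022 dataset loader (v2.1, with document separators, judging by the cache path). A loading sketch is shown below; the keyword names are assumptions based on that path and may differ between Flair versions.

```python
from flair.datasets import NER_HIPE_2022

# Keyword names are assumptions derived from the cache path
# ".../ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator".
corpus = NER_HIPE_2022(
    dataset_name="hipe2020",
    language="de",
    add_document_separator=True,
)
print(corpus)  # expected to report 3575 train / 1235 dev / 1266 test sentences

# Tag dictionary backing the 21-way classification head shown above.
label_dictionary = corpus.make_label_dictionary(label_type="ner")
```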
2023-10-17 16:18:17,475 Train: 3575 sentences
2023-10-17 16:18:17,475 (train_with_dev=False, train_with_test=False)
2023-10-17 16:18:17,475 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,475 Training Params:
2023-10-17 16:18:17,476 - learning_rate: "3e-05"
2023-10-17 16:18:17,476 - mini_batch_size: "4"
2023-10-17 16:18:17,476 - max_epochs: "10"
2023-10-17 16:18:17,476 - shuffle: "True"
2023-10-17 16:18:17,476 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,476 Plugins:
2023-10-17 16:18:17,476 - TensorboardLogger
2023-10-17 16:18:17,476 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:18:17,476 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,476 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:18:17,476 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:18:17,476 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,476 Computation:
2023-10-17 16:18:17,476 - compute on device: cuda:0
2023-10-17 16:18:17,476 - embedding storage: none
2023-10-17 16:18:17,477 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,477 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 16:18:17,477 ----------------------------------------------------------------------------------------------------
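Putting the pieces together, the hyperparameters above (fine-tuning at 3e-05 with a linear schedule and 10% warmup, batch size 4, 10 epochs, best model selected by micro F1 on dev) correspond roughly to the following Flair call. This is a sketch, not the exact training script: `embeddings` and `label_dictionary`/`corpus` are the ones from the sketches above, and the TensorBoard plugin wiring is omitted because its import path varies across Flair versions.

```python
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Plain softmax tagger: no CRF ("crfFalse"), no RNN, no reprojection, matching the module dump.
tagger = SequenceTagger(
    hidden_size=256,                  # only relevant for an RNN, left at its default here
    embeddings=embeddings,
    tag_dictionary=label_dictionary,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)

# fine_tune uses a linear LR schedule; warmup_fraction=0.1 matches the
# LinearScheduler reported above (assumed to be the default used for this run).
trainer.fine_tune(
    "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2",
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
)
```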
2023-10-17 16:18:17,477 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,477 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 16:18:24,576 epoch 1 - iter 89/894 - loss 3.49098594 - time (sec): 7.10 - samples/sec: 1258.54 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:18:31,595 epoch 1 - iter 178/894 - loss 2.32366876 - time (sec): 14.12 - samples/sec: 1228.67 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:18:38,525 epoch 1 - iter 267/894 - loss 1.75529411 - time (sec): 21.05 - samples/sec: 1189.14 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:18:45,733 epoch 1 - iter 356/894 - loss 1.39116171 - time (sec): 28.25 - samples/sec: 1225.66 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:18:52,748 epoch 1 - iter 445/894 - loss 1.18927583 - time (sec): 35.27 - samples/sec: 1221.71 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:18:59,818 epoch 1 - iter 534/894 - loss 1.04847785 - time (sec): 42.34 - samples/sec: 1221.03 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:19:06,763 epoch 1 - iter 623/894 - loss 0.94878418 - time (sec): 49.28 - samples/sec: 1209.92 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:19:13,830 epoch 1 - iter 712/894 - loss 0.86276987 - time (sec): 56.35 - samples/sec: 1215.14 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:19:21,649 epoch 1 - iter 801/894 - loss 0.79019396 - time (sec): 64.17 - samples/sec: 1213.93 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:19:29,090 epoch 1 - iter 890/894 - loss 0.73480438 - time (sec): 71.61 - samples/sec: 1204.46 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:19:29,405 ----------------------------------------------------------------------------------------------------
2023-10-17 16:19:29,405 EPOCH 1 done: loss 0.7328 - lr: 0.000030
2023-10-17 16:19:35,528 DEV : loss 0.15951120853424072 - f1-score (micro avg) 0.6309
2023-10-17 16:19:35,586 saving best model
2023-10-17 16:19:36,204 ----------------------------------------------------------------------------------------------------
2023-10-17 16:19:43,596 epoch 2 - iter 89/894 - loss 0.18396528 - time (sec): 7.39 - samples/sec: 1147.58 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:19:51,344 epoch 2 - iter 178/894 - loss 0.18794022 - time (sec): 15.14 - samples/sec: 1189.48 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:19:58,283 epoch 2 - iter 267/894 - loss 0.18661222 - time (sec): 22.08 - samples/sec: 1186.51 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:20:05,472 epoch 2 - iter 356/894 - loss 0.17708364 - time (sec): 29.26 - samples/sec: 1209.78 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:20:12,475 epoch 2 - iter 445/894 - loss 0.17032159 - time (sec): 36.27 - samples/sec: 1215.16 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:20:19,668 epoch 2 - iter 534/894 - loss 0.16365282 - time (sec): 43.46 - samples/sec: 1197.59 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:20:26,799 epoch 2 - iter 623/894 - loss 0.16022299 - time (sec): 50.59 - samples/sec: 1207.36 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:20:33,843 epoch 2 - iter 712/894 - loss 0.15724052 - time (sec): 57.64 - samples/sec: 1220.13 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:20:40,997 epoch 2 - iter 801/894 - loss 0.15532819 - time (sec): 64.79 - samples/sec: 1210.62 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:20:48,054 epoch 2 - iter 890/894 - loss 0.15430259 - time (sec): 71.85 - samples/sec: 1201.25 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:20:48,368 ----------------------------------------------------------------------------------------------------
2023-10-17 16:20:48,368 EPOCH 2 done: loss 0.1539 - lr: 0.000027
2023-10-17 16:20:59,492 DEV : loss 0.1809595227241516 - f1-score (micro avg) 0.7136
2023-10-17 16:20:59,549 saving best model
2023-10-17 16:21:00,966 ----------------------------------------------------------------------------------------------------
2023-10-17 16:21:08,129 epoch 3 - iter 89/894 - loss 0.09331837 - time (sec): 7.16 - samples/sec: 1124.91 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:21:15,444 epoch 3 - iter 178/894 - loss 0.08405150 - time (sec): 14.47 - samples/sec: 1159.98 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:21:23,055 epoch 3 - iter 267/894 - loss 0.07605411 - time (sec): 22.08 - samples/sec: 1167.85 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:21:30,524 epoch 3 - iter 356/894 - loss 0.07990159 - time (sec): 29.55 - samples/sec: 1152.16 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:21:37,586 epoch 3 - iter 445/894 - loss 0.07898967 - time (sec): 36.61 - samples/sec: 1180.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:21:44,567 epoch 3 - iter 534/894 - loss 0.08478162 - time (sec): 43.60 - samples/sec: 1180.53 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:21:51,598 epoch 3 - iter 623/894 - loss 0.08323998 - time (sec): 50.63 - samples/sec: 1194.00 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:21:58,523 epoch 3 - iter 712/894 - loss 0.08445594 - time (sec): 57.55 - samples/sec: 1198.00 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:22:05,473 epoch 3 - iter 801/894 - loss 0.08776340 - time (sec): 64.50 - samples/sec: 1199.14 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:22:12,539 epoch 3 - iter 890/894 - loss 0.08785018 - time (sec): 71.57 - samples/sec: 1204.47 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:22:12,841 ----------------------------------------------------------------------------------------------------
2023-10-17 16:22:12,841 EPOCH 3 done: loss 0.0879 - lr: 0.000023
2023-10-17 16:22:24,398 DEV : loss 0.17476926743984222 - f1-score (micro avg) 0.7514
2023-10-17 16:22:24,459 saving best model
2023-10-17 16:22:25,882 ----------------------------------------------------------------------------------------------------
2023-10-17 16:22:32,762 epoch 4 - iter 89/894 - loss 0.04063277 - time (sec): 6.88 - samples/sec: 1393.61 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:22:39,858 epoch 4 - iter 178/894 - loss 0.04513127 - time (sec): 13.97 - samples/sec: 1366.56 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:22:46,759 epoch 4 - iter 267/894 - loss 0.04728134 - time (sec): 20.87 - samples/sec: 1302.78 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:22:53,633 epoch 4 - iter 356/894 - loss 0.04603510 - time (sec): 27.75 - samples/sec: 1262.38 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:23:00,593 epoch 4 - iter 445/894 - loss 0.04807142 - time (sec): 34.71 - samples/sec: 1253.87 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:23:07,910 epoch 4 - iter 534/894 - loss 0.04880179 - time (sec): 42.02 - samples/sec: 1245.02 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:23:15,036 epoch 4 - iter 623/894 - loss 0.04793107 - time (sec): 49.15 - samples/sec: 1233.15 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:23:22,257 epoch 4 - iter 712/894 - loss 0.04983695 - time (sec): 56.37 - samples/sec: 1230.64 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:23:29,415 epoch 4 - iter 801/894 - loss 0.05080398 - time (sec): 63.53 - samples/sec: 1225.07 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:23:36,701 epoch 4 - iter 890/894 - loss 0.05295316 - time (sec): 70.81 - samples/sec: 1216.37 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:23:37,026 ----------------------------------------------------------------------------------------------------
2023-10-17 16:23:37,026 EPOCH 4 done: loss 0.0531 - lr: 0.000020
2023-10-17 16:23:48,561 DEV : loss 0.1792767345905304 - f1-score (micro avg) 0.784
2023-10-17 16:23:48,621 saving best model
2023-10-17 16:23:50,016 ----------------------------------------------------------------------------------------------------
2023-10-17 16:23:57,097 epoch 5 - iter 89/894 - loss 0.03003141 - time (sec): 7.08 - samples/sec: 1184.99 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:24:04,339 epoch 5 - iter 178/894 - loss 0.03386181 - time (sec): 14.32 - samples/sec: 1258.85 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:24:11,294 epoch 5 - iter 267/894 - loss 0.03450992 - time (sec): 21.27 - samples/sec: 1259.13 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:24:18,300 epoch 5 - iter 356/894 - loss 0.03581250 - time (sec): 28.28 - samples/sec: 1242.93 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:24:25,225 epoch 5 - iter 445/894 - loss 0.03969636 - time (sec): 35.21 - samples/sec: 1227.42 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:24:32,350 epoch 5 - iter 534/894 - loss 0.04027827 - time (sec): 42.33 - samples/sec: 1233.12 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:24:39,848 epoch 5 - iter 623/894 - loss 0.04050683 - time (sec): 49.83 - samples/sec: 1217.98 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:24:46,961 epoch 5 - iter 712/894 - loss 0.03821864 - time (sec): 56.94 - samples/sec: 1224.89 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:24:54,046 epoch 5 - iter 801/894 - loss 0.03790059 - time (sec): 64.03 - samples/sec: 1216.47 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:25:01,137 epoch 5 - iter 890/894 - loss 0.03627111 - time (sec): 71.12 - samples/sec: 1213.26 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:25:01,446 ----------------------------------------------------------------------------------------------------
2023-10-17 16:25:01,447 EPOCH 5 done: loss 0.0362 - lr: 0.000017
2023-10-17 16:25:13,187 DEV : loss 0.1935582309961319 - f1-score (micro avg) 0.7777
2023-10-17 16:25:13,243 ----------------------------------------------------------------------------------------------------
2023-10-17 16:25:20,221 epoch 6 - iter 89/894 - loss 0.03711638 - time (sec): 6.98 - samples/sec: 1275.39 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:25:27,241 epoch 6 - iter 178/894 - loss 0.02750927 - time (sec): 14.00 - samples/sec: 1256.60 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:25:33,829 epoch 6 - iter 267/894 - loss 0.02860102 - time (sec): 20.58 - samples/sec: 1256.66 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:25:40,054 epoch 6 - iter 356/894 - loss 0.03148660 - time (sec): 26.81 - samples/sec: 1277.67 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:25:46,934 epoch 6 - iter 445/894 - loss 0.02760601 - time (sec): 33.69 - samples/sec: 1276.12 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:25:54,136 epoch 6 - iter 534/894 - loss 0.02699126 - time (sec): 40.89 - samples/sec: 1251.24 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:26:01,914 epoch 6 - iter 623/894 - loss 0.02489164 - time (sec): 48.67 - samples/sec: 1215.02 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:26:10,004 epoch 6 - iter 712/894 - loss 0.02413339 - time (sec): 56.76 - samples/sec: 1201.89 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:26:17,118 epoch 6 - iter 801/894 - loss 0.02224205 - time (sec): 63.87 - samples/sec: 1199.72 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:26:24,447 epoch 6 - iter 890/894 - loss 0.02153523 - time (sec): 71.20 - samples/sec: 1210.61 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:26:24,756 ----------------------------------------------------------------------------------------------------
2023-10-17 16:26:24,757 EPOCH 6 done: loss 0.0214 - lr: 0.000013
2023-10-17 16:26:36,294 DEV : loss 0.21073873341083527 - f1-score (micro avg) 0.7834
2023-10-17 16:26:36,351 ----------------------------------------------------------------------------------------------------
2023-10-17 16:26:43,526 epoch 7 - iter 89/894 - loss 0.01344321 - time (sec): 7.17 - samples/sec: 1213.35 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:26:50,431 epoch 7 - iter 178/894 - loss 0.00956923 - time (sec): 14.08 - samples/sec: 1167.69 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:26:57,578 epoch 7 - iter 267/894 - loss 0.01112114 - time (sec): 21.22 - samples/sec: 1178.79 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:27:04,765 epoch 7 - iter 356/894 - loss 0.01197922 - time (sec): 28.41 - samples/sec: 1200.39 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:27:11,688 epoch 7 - iter 445/894 - loss 0.01450837 - time (sec): 35.33 - samples/sec: 1208.76 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:27:18,436 epoch 7 - iter 534/894 - loss 0.01471580 - time (sec): 42.08 - samples/sec: 1226.69 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:27:25,406 epoch 7 - iter 623/894 - loss 0.01479523 - time (sec): 49.05 - samples/sec: 1226.52 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:27:32,663 epoch 7 - iter 712/894 - loss 0.01445821 - time (sec): 56.31 - samples/sec: 1231.00 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:27:39,657 epoch 7 - iter 801/894 - loss 0.01556922 - time (sec): 63.30 - samples/sec: 1234.27 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:27:46,756 epoch 7 - iter 890/894 - loss 0.01512635 - time (sec): 70.40 - samples/sec: 1222.90 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:27:47,082 ----------------------------------------------------------------------------------------------------
2023-10-17 16:27:47,082 EPOCH 7 done: loss 0.0151 - lr: 0.000010
2023-10-17 16:27:58,328 DEV : loss 0.22550107538700104 - f1-score (micro avg) 0.7899
2023-10-17 16:27:58,394 saving best model
2023-10-17 16:27:59,799 ----------------------------------------------------------------------------------------------------
2023-10-17 16:28:06,882 epoch 8 - iter 89/894 - loss 0.00735787 - time (sec): 7.08 - samples/sec: 1243.80 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:28:13,927 epoch 8 - iter 178/894 - loss 0.01365307 - time (sec): 14.12 - samples/sec: 1205.85 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:28:21,045 epoch 8 - iter 267/894 - loss 0.01185970 - time (sec): 21.24 - samples/sec: 1194.91 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:28:28,164 epoch 8 - iter 356/894 - loss 0.01188072 - time (sec): 28.36 - samples/sec: 1184.91 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:28:35,573 epoch 8 - iter 445/894 - loss 0.01063050 - time (sec): 35.77 - samples/sec: 1219.19 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:28:43,036 epoch 8 - iter 534/894 - loss 0.01071909 - time (sec): 43.23 - samples/sec: 1214.11 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:28:50,128 epoch 8 - iter 623/894 - loss 0.00992864 - time (sec): 50.32 - samples/sec: 1224.89 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:28:57,943 epoch 8 - iter 712/894 - loss 0.00975064 - time (sec): 58.14 - samples/sec: 1198.00 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:29:05,257 epoch 8 - iter 801/894 - loss 0.01030314 - time (sec): 65.45 - samples/sec: 1192.10 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:29:12,309 epoch 8 - iter 890/894 - loss 0.01034406 - time (sec): 72.50 - samples/sec: 1190.39 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:29:12,613 ----------------------------------------------------------------------------------------------------
2023-10-17 16:29:12,613 EPOCH 8 done: loss 0.0103 - lr: 0.000007
2023-10-17 16:29:23,870 DEV : loss 0.2575424909591675 - f1-score (micro avg) 0.7861
2023-10-17 16:29:23,936 ----------------------------------------------------------------------------------------------------
2023-10-17 16:29:31,056 epoch 9 - iter 89/894 - loss 0.00456394 - time (sec): 7.12 - samples/sec: 1194.27 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:29:37,954 epoch 9 - iter 178/894 - loss 0.01007170 - time (sec): 14.02 - samples/sec: 1222.45 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:29:44,870 epoch 9 - iter 267/894 - loss 0.01043709 - time (sec): 20.93 - samples/sec: 1187.76 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:29:51,939 epoch 9 - iter 356/894 - loss 0.00821020 - time (sec): 28.00 - samples/sec: 1208.05 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:29:59,410 epoch 9 - iter 445/894 - loss 0.00750055 - time (sec): 35.47 - samples/sec: 1203.33 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:30:06,520 epoch 9 - iter 534/894 - loss 0.00728371 - time (sec): 42.58 - samples/sec: 1212.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:30:13,357 epoch 9 - iter 623/894 - loss 0.00665355 - time (sec): 49.42 - samples/sec: 1215.10 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:30:20,354 epoch 9 - iter 712/894 - loss 0.00712338 - time (sec): 56.41 - samples/sec: 1218.78 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:30:27,700 epoch 9 - iter 801/894 - loss 0.00673904 - time (sec): 63.76 - samples/sec: 1217.94 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:30:34,836 epoch 9 - iter 890/894 - loss 0.00673433 - time (sec): 70.90 - samples/sec: 1216.48 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:30:35,144 ----------------------------------------------------------------------------------------------------
2023-10-17 16:30:35,144 EPOCH 9 done: loss 0.0067 - lr: 0.000003
2023-10-17 16:30:46,951 DEV : loss 0.2568037509918213 - f1-score (micro avg) 0.7908
2023-10-17 16:30:47,017 saving best model
2023-10-17 16:30:48,440 ----------------------------------------------------------------------------------------------------
2023-10-17 16:30:55,777 epoch 10 - iter 89/894 - loss 0.00138104 - time (sec): 7.33 - samples/sec: 1254.10 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:31:02,774 epoch 10 - iter 178/894 - loss 0.00366289 - time (sec): 14.33 - samples/sec: 1208.69 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:31:09,867 epoch 10 - iter 267/894 - loss 0.00365606 - time (sec): 21.42 - samples/sec: 1185.24 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:31:16,830 epoch 10 - iter 356/894 - loss 0.00343159 - time (sec): 28.39 - samples/sec: 1198.61 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:31:23,966 epoch 10 - iter 445/894 - loss 0.00426874 - time (sec): 35.52 - samples/sec: 1202.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:31:31,313 epoch 10 - iter 534/894 - loss 0.00431619 - time (sec): 42.87 - samples/sec: 1218.53 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:31:38,177 epoch 10 - iter 623/894 - loss 0.00428653 - time (sec): 49.73 - samples/sec: 1204.91 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:31:45,304 epoch 10 - iter 712/894 - loss 0.00426761 - time (sec): 56.86 - samples/sec: 1208.58 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:31:52,251 epoch 10 - iter 801/894 - loss 0.00417570 - time (sec): 63.81 - samples/sec: 1205.29 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:31:59,363 epoch 10 - iter 890/894 - loss 0.00387118 - time (sec): 70.92 - samples/sec: 1214.02 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:31:59,677 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:59,677 EPOCH 10 done: loss 0.0039 - lr: 0.000000
2023-10-17 16:32:11,414 DEV : loss 0.24554787576198578 - f1-score (micro avg) 0.7947
2023-10-17 16:32:11,470 saving best model
2023-10-17 16:32:13,374 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:13,376 Loading model from best epoch ...
2023-10-17 16:32:15,657 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
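The 21 tags form a BIOES encoding (B/I/E/S plus O) over the five HIPE entity types loc, pers, org, prod, and time: 5 × 4 + 1 = 21. A small usage sketch for the saved checkpoint follows; the example sentence is made up purely to illustrate the API.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint written as "best-model.pt" under the base path above.
tagger = SequenceTagger.load(
    "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

# Hypothetical historical German sentence, only to demonstrate prediction.
sentence = Sentence("Der Landbote erschien gestern in Winterthur .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    label = span.get_label("ner")
    print(span.text, label.value, round(label.score, 3))
```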
2023-10-17 16:32:21,664
Results:
- F-score (micro) 0.7665
- F-score (macro) 0.6658
- Accuracy 0.6401
By class:
              precision    recall  f1-score   support

         loc     0.8478    0.8691    0.8583       596
        pers     0.7116    0.8078    0.7567       333
         org     0.5207    0.4773    0.4980       132
        prod     0.5849    0.4697    0.5210        66
        time     0.7174    0.6735    0.6947        49

   micro avg     0.7560    0.7772    0.7665      1176
   macro avg     0.6765    0.6595    0.6658      1176
weighted avg     0.7523    0.7772    0.7634      1176
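As a quick sanity check on the table, the micro-averaged F-score is the harmonic mean of the micro precision and recall reported in the same row:

```python
# Harmonic mean of micro precision and recall from the "micro avg" row.
p, r = 0.7560, 0.7772
print(round(2 * p * r / (p + r), 4))  # 0.7665, matching the reported micro F-score
```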
2023-10-17 16:32:21,665 ----------------------------------------------------------------------------------------------------