2023-10-17 16:18:17,473 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,475 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:18:17,475 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,475 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 16:18:17,475 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,475 Train:  3575 sentences
2023-10-17 16:18:17,475         (train_with_dev=False, train_with_test=False)
2023-10-17 16:18:17,475 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,475 Training Params:
2023-10-17 16:18:17,476  - learning_rate: "3e-05"
2023-10-17 16:18:17,476  - mini_batch_size: "4"
2023-10-17 16:18:17,476  - max_epochs: "10"
2023-10-17 16:18:17,476  - shuffle: "True"
2023-10-17 16:18:17,476 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,476 Plugins:
2023-10-17 16:18:17,476  - TensorboardLogger
2023-10-17 16:18:17,476  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:18:17,476 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,476 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:18:17,476  - metric: "('micro avg', 'f1-score')"
2023-10-17 16:18:17,476 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,476 Computation:
2023-10-17 16:18:17,476  - compute on device: cuda:0
2023-10-17 16:18:17,476  - embedding storage: none
2023-10-17 16:18:17,477 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:17,477 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
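The header above fully determines the training setup: a Flair SequenceTagger with frozen-architecture linear head (no CRF, no RNN) on top of fine-tuned ELECTRA embeddings, trained for 10 epochs with batch size 4 and a linear schedule warming up over 10% of steps. The following sketch shows how such a run could be reproduced with Flair's `ModelTrainer.fine_tune`. It is a reconstruction, not the script that produced this log: the embedding model name, the `NER_HIPE_2022` loader arguments, and the `warmup_fraction` keyword are inferred from the base path and the header and may differ across Flair versions.

```python
# Hypothetical reconstruction of the training setup described in the log header.
# Assumptions: Flair >= 0.13, the HIPE-2020 German split of NER_HIPE_2022, and the
# hmteams/teams-base-historic-multilingual-discriminator backbone inferred from the base path.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus: 3575 train / 1235 dev / 1266 test sentences (loader arguments are an assumption).
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="de")
label_dict = corpus.make_label_dictionary(label_type="ner")

# "poolingfirst-layers-1" in the base path suggests first-subtoken pooling over the last layer only.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# "crfFalse" in the base path suggests a plain linear projection head (21 output tags, as in the model repr).
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)

# Training Params from the header: lr 3e-05, mini-batch size 4, 10 epochs,
# linear warmup over 10% of steps (parameter name may vary by Flair version).
trainer.fine_tune(
    "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2",
    learning_rate=3e-05,
    mini_batch_size=4,
    max_epochs=10,
    warmup_fraction=0.1,
)
```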
"hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2" 2023-10-17 16:18:17,477 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:18:17,477 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:18:17,477 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 16:18:24,576 epoch 1 - iter 89/894 - loss 3.49098594 - time (sec): 7.10 - samples/sec: 1258.54 - lr: 0.000003 - momentum: 0.000000 2023-10-17 16:18:31,595 epoch 1 - iter 178/894 - loss 2.32366876 - time (sec): 14.12 - samples/sec: 1228.67 - lr: 0.000006 - momentum: 0.000000 2023-10-17 16:18:38,525 epoch 1 - iter 267/894 - loss 1.75529411 - time (sec): 21.05 - samples/sec: 1189.14 - lr: 0.000009 - momentum: 0.000000 2023-10-17 16:18:45,733 epoch 1 - iter 356/894 - loss 1.39116171 - time (sec): 28.25 - samples/sec: 1225.66 - lr: 0.000012 - momentum: 0.000000 2023-10-17 16:18:52,748 epoch 1 - iter 445/894 - loss 1.18927583 - time (sec): 35.27 - samples/sec: 1221.71 - lr: 0.000015 - momentum: 0.000000 2023-10-17 16:18:59,818 epoch 1 - iter 534/894 - loss 1.04847785 - time (sec): 42.34 - samples/sec: 1221.03 - lr: 0.000018 - momentum: 0.000000 2023-10-17 16:19:06,763 epoch 1 - iter 623/894 - loss 0.94878418 - time (sec): 49.28 - samples/sec: 1209.92 - lr: 0.000021 - momentum: 0.000000 2023-10-17 16:19:13,830 epoch 1 - iter 712/894 - loss 0.86276987 - time (sec): 56.35 - samples/sec: 1215.14 - lr: 0.000024 - momentum: 0.000000 2023-10-17 16:19:21,649 epoch 1 - iter 801/894 - loss 0.79019396 - time (sec): 64.17 - samples/sec: 1213.93 - lr: 0.000027 - momentum: 0.000000 2023-10-17 16:19:29,090 epoch 1 - iter 890/894 - loss 0.73480438 - time (sec): 71.61 - samples/sec: 1204.46 - lr: 0.000030 - momentum: 0.000000 2023-10-17 16:19:29,405 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:19:29,405 EPOCH 1 done: loss 0.7328 - lr: 0.000030 2023-10-17 16:19:35,528 DEV : loss 0.15951120853424072 - f1-score (micro avg) 0.6309 2023-10-17 16:19:35,586 saving best model 2023-10-17 16:19:36,204 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:19:43,596 epoch 2 - iter 89/894 - loss 0.18396528 - time (sec): 7.39 - samples/sec: 1147.58 - lr: 0.000030 - momentum: 0.000000 2023-10-17 16:19:51,344 epoch 2 - iter 178/894 - loss 0.18794022 - time (sec): 15.14 - samples/sec: 1189.48 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:19:58,283 epoch 2 - iter 267/894 - loss 0.18661222 - time (sec): 22.08 - samples/sec: 1186.51 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:20:05,472 epoch 2 - iter 356/894 - loss 0.17708364 - time (sec): 29.26 - samples/sec: 1209.78 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:20:12,475 epoch 2 - iter 445/894 - loss 0.17032159 - time (sec): 36.27 - samples/sec: 1215.16 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:20:19,668 epoch 2 - iter 534/894 - loss 0.16365282 - time (sec): 43.46 - samples/sec: 1197.59 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:20:26,799 epoch 2 - iter 623/894 - loss 0.16022299 - time (sec): 50.59 - samples/sec: 1207.36 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:20:33,843 epoch 2 - iter 712/894 - loss 0.15724052 - time (sec): 57.64 - samples/sec: 1220.13 - lr: 0.000027 - momentum: 0.000000 2023-10-17 16:20:40,997 epoch 2 - 
2023-10-17 16:20:48,054 epoch 2 - iter 890/894 - loss 0.15430259 - time (sec): 71.85 - samples/sec: 1201.25 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:20:48,368 ----------------------------------------------------------------------------------------------------
2023-10-17 16:20:48,368 EPOCH 2 done: loss 0.1539 - lr: 0.000027
2023-10-17 16:20:59,492 DEV : loss 0.1809595227241516 - f1-score (micro avg) 0.7136
2023-10-17 16:20:59,549 saving best model
2023-10-17 16:21:00,966 ----------------------------------------------------------------------------------------------------
2023-10-17 16:21:08,129 epoch 3 - iter 89/894 - loss 0.09331837 - time (sec): 7.16 - samples/sec: 1124.91 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:21:15,444 epoch 3 - iter 178/894 - loss 0.08405150 - time (sec): 14.47 - samples/sec: 1159.98 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:21:23,055 epoch 3 - iter 267/894 - loss 0.07605411 - time (sec): 22.08 - samples/sec: 1167.85 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:21:30,524 epoch 3 - iter 356/894 - loss 0.07990159 - time (sec): 29.55 - samples/sec: 1152.16 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:21:37,586 epoch 3 - iter 445/894 - loss 0.07898967 - time (sec): 36.61 - samples/sec: 1180.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:21:44,567 epoch 3 - iter 534/894 - loss 0.08478162 - time (sec): 43.60 - samples/sec: 1180.53 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:21:51,598 epoch 3 - iter 623/894 - loss 0.08323998 - time (sec): 50.63 - samples/sec: 1194.00 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:21:58,523 epoch 3 - iter 712/894 - loss 0.08445594 - time (sec): 57.55 - samples/sec: 1198.00 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:22:05,473 epoch 3 - iter 801/894 - loss 0.08776340 - time (sec): 64.50 - samples/sec: 1199.14 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:22:12,539 epoch 3 - iter 890/894 - loss 0.08785018 - time (sec): 71.57 - samples/sec: 1204.47 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:22:12,841 ----------------------------------------------------------------------------------------------------
2023-10-17 16:22:12,841 EPOCH 3 done: loss 0.0879 - lr: 0.000023
2023-10-17 16:22:24,398 DEV : loss 0.17476926743984222 - f1-score (micro avg) 0.7514
2023-10-17 16:22:24,459 saving best model
2023-10-17 16:22:25,882 ----------------------------------------------------------------------------------------------------
2023-10-17 16:22:32,762 epoch 4 - iter 89/894 - loss 0.04063277 - time (sec): 6.88 - samples/sec: 1393.61 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:22:39,858 epoch 4 - iter 178/894 - loss 0.04513127 - time (sec): 13.97 - samples/sec: 1366.56 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:22:46,759 epoch 4 - iter 267/894 - loss 0.04728134 - time (sec): 20.87 - samples/sec: 1302.78 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:22:53,633 epoch 4 - iter 356/894 - loss 0.04603510 - time (sec): 27.75 - samples/sec: 1262.38 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:23:00,593 epoch 4 - iter 445/894 - loss 0.04807142 - time (sec): 34.71 - samples/sec: 1253.87 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:23:07,910 epoch 4 - iter 534/894 - loss 0.04880179 - time (sec): 42.02 - samples/sec: 1245.02 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:23:15,036 epoch 4 - iter 623/894 - loss 0.04793107 - time (sec): 49.15 - samples/sec: 1233.15 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:23:22,257 epoch 4 - iter 712/894 - loss 0.04983695 - time (sec): 56.37 - samples/sec: 1230.64 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:23:29,415 epoch 4 - iter 801/894 - loss 0.05080398 - time (sec): 63.53 - samples/sec: 1225.07 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:23:36,701 epoch 4 - iter 890/894 - loss 0.05295316 - time (sec): 70.81 - samples/sec: 1216.37 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:23:37,026 ----------------------------------------------------------------------------------------------------
2023-10-17 16:23:37,026 EPOCH 4 done: loss 0.0531 - lr: 0.000020
2023-10-17 16:23:48,561 DEV : loss 0.1792767345905304 - f1-score (micro avg) 0.784
2023-10-17 16:23:48,621 saving best model
2023-10-17 16:23:50,016 ----------------------------------------------------------------------------------------------------
2023-10-17 16:23:57,097 epoch 5 - iter 89/894 - loss 0.03003141 - time (sec): 7.08 - samples/sec: 1184.99 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:24:04,339 epoch 5 - iter 178/894 - loss 0.03386181 - time (sec): 14.32 - samples/sec: 1258.85 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:24:11,294 epoch 5 - iter 267/894 - loss 0.03450992 - time (sec): 21.27 - samples/sec: 1259.13 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:24:18,300 epoch 5 - iter 356/894 - loss 0.03581250 - time (sec): 28.28 - samples/sec: 1242.93 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:24:25,225 epoch 5 - iter 445/894 - loss 0.03969636 - time (sec): 35.21 - samples/sec: 1227.42 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:24:32,350 epoch 5 - iter 534/894 - loss 0.04027827 - time (sec): 42.33 - samples/sec: 1233.12 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:24:39,848 epoch 5 - iter 623/894 - loss 0.04050683 - time (sec): 49.83 - samples/sec: 1217.98 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:24:46,961 epoch 5 - iter 712/894 - loss 0.03821864 - time (sec): 56.94 - samples/sec: 1224.89 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:24:54,046 epoch 5 - iter 801/894 - loss 0.03790059 - time (sec): 64.03 - samples/sec: 1216.47 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:25:01,137 epoch 5 - iter 890/894 - loss 0.03627111 - time (sec): 71.12 - samples/sec: 1213.26 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:25:01,446 ----------------------------------------------------------------------------------------------------
2023-10-17 16:25:01,447 EPOCH 5 done: loss 0.0362 - lr: 0.000017
2023-10-17 16:25:13,187 DEV : loss 0.1935582309961319 - f1-score (micro avg) 0.7777
2023-10-17 16:25:13,243 ----------------------------------------------------------------------------------------------------
2023-10-17 16:25:20,221 epoch 6 - iter 89/894 - loss 0.03711638 - time (sec): 6.98 - samples/sec: 1275.39 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:25:27,241 epoch 6 - iter 178/894 - loss 0.02750927 - time (sec): 14.00 - samples/sec: 1256.60 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:25:33,829 epoch 6 - iter 267/894 - loss 0.02860102 - time (sec): 20.58 - samples/sec: 1256.66 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:25:40,054 epoch 6 - iter 356/894 - loss 0.03148660 - time (sec): 26.81 - samples/sec: 1277.67 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:25:46,934 epoch 6 - iter 445/894 - loss 0.02760601 - time (sec): 33.69 - samples/sec: 1276.12 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:25:54,136 epoch 6 - iter 534/894 - loss 0.02699126 - time (sec): 40.89 - samples/sec: 1251.24 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:26:01,914 epoch 6 - iter 623/894 - loss 0.02489164 - time (sec): 48.67 - samples/sec: 1215.02 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:26:10,004 epoch 6 - iter 712/894 - loss 0.02413339 - time (sec): 56.76 - samples/sec: 1201.89 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:26:17,118 epoch 6 - iter 801/894 - loss 0.02224205 - time (sec): 63.87 - samples/sec: 1199.72 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:26:24,447 epoch 6 - iter 890/894 - loss 0.02153523 - time (sec): 71.20 - samples/sec: 1210.61 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:26:24,756 ----------------------------------------------------------------------------------------------------
2023-10-17 16:26:24,757 EPOCH 6 done: loss 0.0214 - lr: 0.000013
2023-10-17 16:26:36,294 DEV : loss 0.21073873341083527 - f1-score (micro avg) 0.7834
2023-10-17 16:26:36,351 ----------------------------------------------------------------------------------------------------
2023-10-17 16:26:43,526 epoch 7 - iter 89/894 - loss 0.01344321 - time (sec): 7.17 - samples/sec: 1213.35 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:26:50,431 epoch 7 - iter 178/894 - loss 0.00956923 - time (sec): 14.08 - samples/sec: 1167.69 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:26:57,578 epoch 7 - iter 267/894 - loss 0.01112114 - time (sec): 21.22 - samples/sec: 1178.79 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:27:04,765 epoch 7 - iter 356/894 - loss 0.01197922 - time (sec): 28.41 - samples/sec: 1200.39 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:27:11,688 epoch 7 - iter 445/894 - loss 0.01450837 - time (sec): 35.33 - samples/sec: 1208.76 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:27:18,436 epoch 7 - iter 534/894 - loss 0.01471580 - time (sec): 42.08 - samples/sec: 1226.69 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:27:25,406 epoch 7 - iter 623/894 - loss 0.01479523 - time (sec): 49.05 - samples/sec: 1226.52 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:27:32,663 epoch 7 - iter 712/894 - loss 0.01445821 - time (sec): 56.31 - samples/sec: 1231.00 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:27:39,657 epoch 7 - iter 801/894 - loss 0.01556922 - time (sec): 63.30 - samples/sec: 1234.27 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:27:46,756 epoch 7 - iter 890/894 - loss 0.01512635 - time (sec): 70.40 - samples/sec: 1222.90 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:27:47,082 ----------------------------------------------------------------------------------------------------
2023-10-17 16:27:47,082 EPOCH 7 done: loss 0.0151 - lr: 0.000010
2023-10-17 16:27:58,328 DEV : loss 0.22550107538700104 - f1-score (micro avg) 0.7899
2023-10-17 16:27:58,394 saving best model
2023-10-17 16:27:59,799 ----------------------------------------------------------------------------------------------------
2023-10-17 16:28:06,882 epoch 8 - iter 89/894 - loss 0.00735787 - time (sec): 7.08 - samples/sec: 1243.80 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:28:13,927 epoch 8 - iter 178/894 - loss 0.01365307 - time (sec): 14.12 - samples/sec: 1205.85 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:28:21,045 epoch 8 - iter 267/894 - loss 0.01185970 - time (sec): 21.24 - samples/sec: 1194.91 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:28:28,164 epoch 8 - iter 356/894 - loss 0.01188072 - time (sec): 28.36 - samples/sec: 1184.91 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:28:35,573 epoch 8 - iter 445/894 - loss 0.01063050 - time (sec): 35.77 - samples/sec: 1219.19 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:28:43,036 epoch 8 - iter 534/894 - loss 0.01071909 - time (sec): 43.23 - samples/sec: 1214.11 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:28:50,128 epoch 8 - iter 623/894 - loss 0.00992864 - time (sec): 50.32 - samples/sec: 1224.89 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:28:57,943 epoch 8 - iter 712/894 - loss 0.00975064 - time (sec): 58.14 - samples/sec: 1198.00 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:29:05,257 epoch 8 - iter 801/894 - loss 0.01030314 - time (sec): 65.45 - samples/sec: 1192.10 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:29:12,309 epoch 8 - iter 890/894 - loss 0.01034406 - time (sec): 72.50 - samples/sec: 1190.39 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:29:12,613 ----------------------------------------------------------------------------------------------------
2023-10-17 16:29:12,613 EPOCH 8 done: loss 0.0103 - lr: 0.000007
2023-10-17 16:29:23,870 DEV : loss 0.2575424909591675 - f1-score (micro avg) 0.7861
2023-10-17 16:29:23,936 ----------------------------------------------------------------------------------------------------
2023-10-17 16:29:31,056 epoch 9 - iter 89/894 - loss 0.00456394 - time (sec): 7.12 - samples/sec: 1194.27 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:29:37,954 epoch 9 - iter 178/894 - loss 0.01007170 - time (sec): 14.02 - samples/sec: 1222.45 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:29:44,870 epoch 9 - iter 267/894 - loss 0.01043709 - time (sec): 20.93 - samples/sec: 1187.76 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:29:51,939 epoch 9 - iter 356/894 - loss 0.00821020 - time (sec): 28.00 - samples/sec: 1208.05 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:29:59,410 epoch 9 - iter 445/894 - loss 0.00750055 - time (sec): 35.47 - samples/sec: 1203.33 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:30:06,520 epoch 9 - iter 534/894 - loss 0.00728371 - time (sec): 42.58 - samples/sec: 1212.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:30:13,357 epoch 9 - iter 623/894 - loss 0.00665355 - time (sec): 49.42 - samples/sec: 1215.10 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:30:20,354 epoch 9 - iter 712/894 - loss 0.00712338 - time (sec): 56.41 - samples/sec: 1218.78 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:30:27,700 epoch 9 - iter 801/894 - loss 0.00673904 - time (sec): 63.76 - samples/sec: 1217.94 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:30:34,836 epoch 9 - iter 890/894 - loss 0.00673433 - time (sec): 70.90 - samples/sec: 1216.48 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:30:35,144 ----------------------------------------------------------------------------------------------------
2023-10-17 16:30:35,144 EPOCH 9 done: loss 0.0067 - lr: 0.000003
2023-10-17 16:30:46,951 DEV : loss 0.2568037509918213 - f1-score (micro avg) 0.7908
2023-10-17 16:30:47,017 saving best model
2023-10-17 16:30:48,440 ----------------------------------------------------------------------------------------------------
2023-10-17 16:30:55,777 epoch 10 - iter 89/894 - loss 0.00138104 - time (sec): 7.33 - samples/sec: 1254.10 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:31:02,774 epoch 10 - iter 178/894 - loss 0.00366289 - time (sec): 14.33 - samples/sec: 1208.69 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:31:09,867 epoch 10 - iter 267/894 - loss 0.00365606 - time (sec): 21.42 - samples/sec: 1185.24 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:31:16,830 epoch 10 - iter 356/894 - loss 0.00343159 - time (sec): 28.39 - samples/sec: 1198.61 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:31:23,966 epoch 10 - iter 445/894 - loss 0.00426874 - time (sec): 35.52 - samples/sec: 1202.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:31:31,313 epoch 10 - iter 534/894 - loss 0.00431619 - time (sec): 42.87 - samples/sec: 1218.53 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:31:38,177 epoch 10 - iter 623/894 - loss 0.00428653 - time (sec): 49.73 - samples/sec: 1204.91 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:31:45,304 epoch 10 - iter 712/894 - loss 0.00426761 - time (sec): 56.86 - samples/sec: 1208.58 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:31:52,251 epoch 10 - iter 801/894 - loss 0.00417570 - time (sec): 63.81 - samples/sec: 1205.29 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:31:59,363 epoch 10 - iter 890/894 - loss 0.00387118 - time (sec): 70.92 - samples/sec: 1214.02 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:31:59,677 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:59,677 EPOCH 10 done: loss 0.0039 - lr: 0.000000
2023-10-17 16:32:11,414 DEV : loss 0.24554787576198578 - f1-score (micro avg) 0.7947
2023-10-17 16:32:11,470 saving best model
2023-10-17 16:32:13,374 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:13,376 Loading model from best epoch ...
2023-10-17 16:32:15,657 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-17 16:32:21,664 Results:
- F-score (micro) 0.7665
- F-score (macro) 0.6658
- Accuracy 0.6401

By class:
              precision    recall  f1-score   support

         loc     0.8478    0.8691    0.8583       596
        pers     0.7116    0.8078    0.7567       333
         org     0.5207    0.4773    0.4980       132
        prod     0.5849    0.4697    0.5210        66
        time     0.7174    0.6735    0.6947        49

   micro avg     0.7560    0.7772    0.7665      1176
   macro avg     0.6765    0.6595    0.6658      1176
weighted avg     0.7523    0.7772    0.7634      1176

2023-10-17 16:32:21,665 ----------------------------------------------------------------------------------------------------
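The final evaluation above was produced from the best checkpoint (best-model.pt, dev micro F1 0.7947 after epoch 10), scoring 0.7665 micro F1 on the test split. A minimal usage sketch for that checkpoint follows, assuming best-model.pt is available under the training base path shown in the header and that the tagger's label type is "ner"; the example sentence is purely illustrative.

```python
# Minimal inference sketch for the checkpoint saved during this run (assumed location).
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

# Illustrative German sentence; the model was trained on historic newspaper text.
sentence = Sentence("Der Gemeinderat von Zürich tagte gestern im Rathaus .")
tagger.predict(sentence)

# Print predicted entity spans with their decoded labels (loc, pers, org, prod, time) and confidences.
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)
```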