|
2023-10-17 16:18:17,473 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:18:17,475 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 16:18:17,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:18:17,475 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-17 16:18:17,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:18:17,475 Train: 3575 sentences |
|
2023-10-17 16:18:17,475 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 16:18:17,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:18:17,475 Training Params: |
|
2023-10-17 16:18:17,476 - learning_rate: "3e-05" |
|
2023-10-17 16:18:17,476 - mini_batch_size: "4" |
|
2023-10-17 16:18:17,476 - max_epochs: "10" |
|
2023-10-17 16:18:17,476 - shuffle: "True" |
|
2023-10-17 16:18:17,476 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:18:17,476 Plugins: |
|
2023-10-17 16:18:17,476 - TensorboardLogger |
|
2023-10-17 16:18:17,476 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 16:18:17,476 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:18:17,476 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 16:18:17,476 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 16:18:17,476 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:18:17,476 Computation: |
|
2023-10-17 16:18:17,476 - compute on device: cuda:0 |
|
2023-10-17 16:18:17,476 - embedding storage: none |
|
2023-10-17 16:18:17,477 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:18:17,477 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-17 16:18:17,477 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:18:17,477 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:18:17,477 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 16:18:24,576 epoch 1 - iter 89/894 - loss 3.49098594 - time (sec): 7.10 - samples/sec: 1258.54 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 16:18:31,595 epoch 1 - iter 178/894 - loss 2.32366876 - time (sec): 14.12 - samples/sec: 1228.67 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 16:18:38,525 epoch 1 - iter 267/894 - loss 1.75529411 - time (sec): 21.05 - samples/sec: 1189.14 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 16:18:45,733 epoch 1 - iter 356/894 - loss 1.39116171 - time (sec): 28.25 - samples/sec: 1225.66 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 16:18:52,748 epoch 1 - iter 445/894 - loss 1.18927583 - time (sec): 35.27 - samples/sec: 1221.71 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 16:18:59,818 epoch 1 - iter 534/894 - loss 1.04847785 - time (sec): 42.34 - samples/sec: 1221.03 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 16:19:06,763 epoch 1 - iter 623/894 - loss 0.94878418 - time (sec): 49.28 - samples/sec: 1209.92 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 16:19:13,830 epoch 1 - iter 712/894 - loss 0.86276987 - time (sec): 56.35 - samples/sec: 1215.14 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 16:19:21,649 epoch 1 - iter 801/894 - loss 0.79019396 - time (sec): 64.17 - samples/sec: 1213.93 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 16:19:29,090 epoch 1 - iter 890/894 - loss 0.73480438 - time (sec): 71.61 - samples/sec: 1204.46 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 16:19:29,405 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:19:29,405 EPOCH 1 done: loss 0.7328 - lr: 0.000030 |
|
2023-10-17 16:19:35,528 DEV : loss 0.15951120853424072 - f1-score (micro avg) 0.6309 |
|
2023-10-17 16:19:35,586 saving best model |
|
2023-10-17 16:19:36,204 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:19:43,596 epoch 2 - iter 89/894 - loss 0.18396528 - time (sec): 7.39 - samples/sec: 1147.58 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 16:19:51,344 epoch 2 - iter 178/894 - loss 0.18794022 - time (sec): 15.14 - samples/sec: 1189.48 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 16:19:58,283 epoch 2 - iter 267/894 - loss 0.18661222 - time (sec): 22.08 - samples/sec: 1186.51 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 16:20:05,472 epoch 2 - iter 356/894 - loss 0.17708364 - time (sec): 29.26 - samples/sec: 1209.78 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 16:20:12,475 epoch 2 - iter 445/894 - loss 0.17032159 - time (sec): 36.27 - samples/sec: 1215.16 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 16:20:19,668 epoch 2 - iter 534/894 - loss 0.16365282 - time (sec): 43.46 - samples/sec: 1197.59 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 16:20:26,799 epoch 2 - iter 623/894 - loss 0.16022299 - time (sec): 50.59 - samples/sec: 1207.36 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 16:20:33,843 epoch 2 - iter 712/894 - loss 0.15724052 - time (sec): 57.64 - samples/sec: 1220.13 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 16:20:40,997 epoch 2 - iter 801/894 - loss 0.15532819 - time (sec): 64.79 - samples/sec: 1210.62 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 16:20:48,054 epoch 2 - iter 890/894 - loss 0.15430259 - time (sec): 71.85 - samples/sec: 1201.25 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 16:20:48,368 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:20:48,368 EPOCH 2 done: loss 0.1539 - lr: 0.000027 |
|
2023-10-17 16:20:59,492 DEV : loss 0.1809595227241516 - f1-score (micro avg) 0.7136 |
|
2023-10-17 16:20:59,549 saving best model |
|
2023-10-17 16:21:00,966 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:21:08,129 epoch 3 - iter 89/894 - loss 0.09331837 - time (sec): 7.16 - samples/sec: 1124.91 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 16:21:15,444 epoch 3 - iter 178/894 - loss 0.08405150 - time (sec): 14.47 - samples/sec: 1159.98 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 16:21:23,055 epoch 3 - iter 267/894 - loss 0.07605411 - time (sec): 22.08 - samples/sec: 1167.85 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 16:21:30,524 epoch 3 - iter 356/894 - loss 0.07990159 - time (sec): 29.55 - samples/sec: 1152.16 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 16:21:37,586 epoch 3 - iter 445/894 - loss 0.07898967 - time (sec): 36.61 - samples/sec: 1180.95 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 16:21:44,567 epoch 3 - iter 534/894 - loss 0.08478162 - time (sec): 43.60 - samples/sec: 1180.53 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 16:21:51,598 epoch 3 - iter 623/894 - loss 0.08323998 - time (sec): 50.63 - samples/sec: 1194.00 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 16:21:58,523 epoch 3 - iter 712/894 - loss 0.08445594 - time (sec): 57.55 - samples/sec: 1198.00 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 16:22:05,473 epoch 3 - iter 801/894 - loss 0.08776340 - time (sec): 64.50 - samples/sec: 1199.14 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 16:22:12,539 epoch 3 - iter 890/894 - loss 0.08785018 - time (sec): 71.57 - samples/sec: 1204.47 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 16:22:12,841 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:22:12,841 EPOCH 3 done: loss 0.0879 - lr: 0.000023 |
|
2023-10-17 16:22:24,398 DEV : loss 0.17476926743984222 - f1-score (micro avg) 0.7514 |
|
2023-10-17 16:22:24,459 saving best model |
|
2023-10-17 16:22:25,882 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:22:32,762 epoch 4 - iter 89/894 - loss 0.04063277 - time (sec): 6.88 - samples/sec: 1393.61 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 16:22:39,858 epoch 4 - iter 178/894 - loss 0.04513127 - time (sec): 13.97 - samples/sec: 1366.56 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 16:22:46,759 epoch 4 - iter 267/894 - loss 0.04728134 - time (sec): 20.87 - samples/sec: 1302.78 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 16:22:53,633 epoch 4 - iter 356/894 - loss 0.04603510 - time (sec): 27.75 - samples/sec: 1262.38 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 16:23:00,593 epoch 4 - iter 445/894 - loss 0.04807142 - time (sec): 34.71 - samples/sec: 1253.87 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 16:23:07,910 epoch 4 - iter 534/894 - loss 0.04880179 - time (sec): 42.02 - samples/sec: 1245.02 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 16:23:15,036 epoch 4 - iter 623/894 - loss 0.04793107 - time (sec): 49.15 - samples/sec: 1233.15 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 16:23:22,257 epoch 4 - iter 712/894 - loss 0.04983695 - time (sec): 56.37 - samples/sec: 1230.64 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 16:23:29,415 epoch 4 - iter 801/894 - loss 0.05080398 - time (sec): 63.53 - samples/sec: 1225.07 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 16:23:36,701 epoch 4 - iter 890/894 - loss 0.05295316 - time (sec): 70.81 - samples/sec: 1216.37 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 16:23:37,026 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:23:37,026 EPOCH 4 done: loss 0.0531 - lr: 0.000020 |
|
2023-10-17 16:23:48,561 DEV : loss 0.1792767345905304 - f1-score (micro avg) 0.784 |
|
2023-10-17 16:23:48,621 saving best model |
|
2023-10-17 16:23:50,016 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:23:57,097 epoch 5 - iter 89/894 - loss 0.03003141 - time (sec): 7.08 - samples/sec: 1184.99 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 16:24:04,339 epoch 5 - iter 178/894 - loss 0.03386181 - time (sec): 14.32 - samples/sec: 1258.85 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 16:24:11,294 epoch 5 - iter 267/894 - loss 0.03450992 - time (sec): 21.27 - samples/sec: 1259.13 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 16:24:18,300 epoch 5 - iter 356/894 - loss 0.03581250 - time (sec): 28.28 - samples/sec: 1242.93 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 16:24:25,225 epoch 5 - iter 445/894 - loss 0.03969636 - time (sec): 35.21 - samples/sec: 1227.42 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 16:24:32,350 epoch 5 - iter 534/894 - loss 0.04027827 - time (sec): 42.33 - samples/sec: 1233.12 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 16:24:39,848 epoch 5 - iter 623/894 - loss 0.04050683 - time (sec): 49.83 - samples/sec: 1217.98 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 16:24:46,961 epoch 5 - iter 712/894 - loss 0.03821864 - time (sec): 56.94 - samples/sec: 1224.89 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 16:24:54,046 epoch 5 - iter 801/894 - loss 0.03790059 - time (sec): 64.03 - samples/sec: 1216.47 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 16:25:01,137 epoch 5 - iter 890/894 - loss 0.03627111 - time (sec): 71.12 - samples/sec: 1213.26 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 16:25:01,446 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:25:01,447 EPOCH 5 done: loss 0.0362 - lr: 0.000017 |
|
2023-10-17 16:25:13,187 DEV : loss 0.1935582309961319 - f1-score (micro avg) 0.7777 |
|
2023-10-17 16:25:13,243 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:25:20,221 epoch 6 - iter 89/894 - loss 0.03711638 - time (sec): 6.98 - samples/sec: 1275.39 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 16:25:27,241 epoch 6 - iter 178/894 - loss 0.02750927 - time (sec): 14.00 - samples/sec: 1256.60 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 16:25:33,829 epoch 6 - iter 267/894 - loss 0.02860102 - time (sec): 20.58 - samples/sec: 1256.66 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 16:25:40,054 epoch 6 - iter 356/894 - loss 0.03148660 - time (sec): 26.81 - samples/sec: 1277.67 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 16:25:46,934 epoch 6 - iter 445/894 - loss 0.02760601 - time (sec): 33.69 - samples/sec: 1276.12 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 16:25:54,136 epoch 6 - iter 534/894 - loss 0.02699126 - time (sec): 40.89 - samples/sec: 1251.24 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 16:26:01,914 epoch 6 - iter 623/894 - loss 0.02489164 - time (sec): 48.67 - samples/sec: 1215.02 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 16:26:10,004 epoch 6 - iter 712/894 - loss 0.02413339 - time (sec): 56.76 - samples/sec: 1201.89 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 16:26:17,118 epoch 6 - iter 801/894 - loss 0.02224205 - time (sec): 63.87 - samples/sec: 1199.72 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 16:26:24,447 epoch 6 - iter 890/894 - loss 0.02153523 - time (sec): 71.20 - samples/sec: 1210.61 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 16:26:24,756 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:26:24,757 EPOCH 6 done: loss 0.0214 - lr: 0.000013 |
|
2023-10-17 16:26:36,294 DEV : loss 0.21073873341083527 - f1-score (micro avg) 0.7834 |
|
2023-10-17 16:26:36,351 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:26:43,526 epoch 7 - iter 89/894 - loss 0.01344321 - time (sec): 7.17 - samples/sec: 1213.35 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 16:26:50,431 epoch 7 - iter 178/894 - loss 0.00956923 - time (sec): 14.08 - samples/sec: 1167.69 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 16:26:57,578 epoch 7 - iter 267/894 - loss 0.01112114 - time (sec): 21.22 - samples/sec: 1178.79 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 16:27:04,765 epoch 7 - iter 356/894 - loss 0.01197922 - time (sec): 28.41 - samples/sec: 1200.39 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 16:27:11,688 epoch 7 - iter 445/894 - loss 0.01450837 - time (sec): 35.33 - samples/sec: 1208.76 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 16:27:18,436 epoch 7 - iter 534/894 - loss 0.01471580 - time (sec): 42.08 - samples/sec: 1226.69 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 16:27:25,406 epoch 7 - iter 623/894 - loss 0.01479523 - time (sec): 49.05 - samples/sec: 1226.52 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 16:27:32,663 epoch 7 - iter 712/894 - loss 0.01445821 - time (sec): 56.31 - samples/sec: 1231.00 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 16:27:39,657 epoch 7 - iter 801/894 - loss 0.01556922 - time (sec): 63.30 - samples/sec: 1234.27 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 16:27:46,756 epoch 7 - iter 890/894 - loss 0.01512635 - time (sec): 70.40 - samples/sec: 1222.90 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 16:27:47,082 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:27:47,082 EPOCH 7 done: loss 0.0151 - lr: 0.000010 |
|
2023-10-17 16:27:58,328 DEV : loss 0.22550107538700104 - f1-score (micro avg) 0.7899 |
|
2023-10-17 16:27:58,394 saving best model |
|
2023-10-17 16:27:59,799 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:28:06,882 epoch 8 - iter 89/894 - loss 0.00735787 - time (sec): 7.08 - samples/sec: 1243.80 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 16:28:13,927 epoch 8 - iter 178/894 - loss 0.01365307 - time (sec): 14.12 - samples/sec: 1205.85 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 16:28:21,045 epoch 8 - iter 267/894 - loss 0.01185970 - time (sec): 21.24 - samples/sec: 1194.91 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 16:28:28,164 epoch 8 - iter 356/894 - loss 0.01188072 - time (sec): 28.36 - samples/sec: 1184.91 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 16:28:35,573 epoch 8 - iter 445/894 - loss 0.01063050 - time (sec): 35.77 - samples/sec: 1219.19 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 16:28:43,036 epoch 8 - iter 534/894 - loss 0.01071909 - time (sec): 43.23 - samples/sec: 1214.11 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 16:28:50,128 epoch 8 - iter 623/894 - loss 0.00992864 - time (sec): 50.32 - samples/sec: 1224.89 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 16:28:57,943 epoch 8 - iter 712/894 - loss 0.00975064 - time (sec): 58.14 - samples/sec: 1198.00 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 16:29:05,257 epoch 8 - iter 801/894 - loss 0.01030314 - time (sec): 65.45 - samples/sec: 1192.10 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 16:29:12,309 epoch 8 - iter 890/894 - loss 0.01034406 - time (sec): 72.50 - samples/sec: 1190.39 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 16:29:12,613 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:29:12,613 EPOCH 8 done: loss 0.0103 - lr: 0.000007 |
|
2023-10-17 16:29:23,870 DEV : loss 0.2575424909591675 - f1-score (micro avg) 0.7861 |
|
2023-10-17 16:29:23,936 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:29:31,056 epoch 9 - iter 89/894 - loss 0.00456394 - time (sec): 7.12 - samples/sec: 1194.27 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 16:29:37,954 epoch 9 - iter 178/894 - loss 0.01007170 - time (sec): 14.02 - samples/sec: 1222.45 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 16:29:44,870 epoch 9 - iter 267/894 - loss 0.01043709 - time (sec): 20.93 - samples/sec: 1187.76 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 16:29:51,939 epoch 9 - iter 356/894 - loss 0.00821020 - time (sec): 28.00 - samples/sec: 1208.05 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 16:29:59,410 epoch 9 - iter 445/894 - loss 0.00750055 - time (sec): 35.47 - samples/sec: 1203.33 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 16:30:06,520 epoch 9 - iter 534/894 - loss 0.00728371 - time (sec): 42.58 - samples/sec: 1212.36 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 16:30:13,357 epoch 9 - iter 623/894 - loss 0.00665355 - time (sec): 49.42 - samples/sec: 1215.10 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 16:30:20,354 epoch 9 - iter 712/894 - loss 0.00712338 - time (sec): 56.41 - samples/sec: 1218.78 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 16:30:27,700 epoch 9 - iter 801/894 - loss 0.00673904 - time (sec): 63.76 - samples/sec: 1217.94 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 16:30:34,836 epoch 9 - iter 890/894 - loss 0.00673433 - time (sec): 70.90 - samples/sec: 1216.48 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 16:30:35,144 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:30:35,144 EPOCH 9 done: loss 0.0067 - lr: 0.000003 |
|
2023-10-17 16:30:46,951 DEV : loss 0.2568037509918213 - f1-score (micro avg) 0.7908 |
|
2023-10-17 16:30:47,017 saving best model |
|
2023-10-17 16:30:48,440 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:30:55,777 epoch 10 - iter 89/894 - loss 0.00138104 - time (sec): 7.33 - samples/sec: 1254.10 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 16:31:02,774 epoch 10 - iter 178/894 - loss 0.00366289 - time (sec): 14.33 - samples/sec: 1208.69 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 16:31:09,867 epoch 10 - iter 267/894 - loss 0.00365606 - time (sec): 21.42 - samples/sec: 1185.24 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 16:31:16,830 epoch 10 - iter 356/894 - loss 0.00343159 - time (sec): 28.39 - samples/sec: 1198.61 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 16:31:23,966 epoch 10 - iter 445/894 - loss 0.00426874 - time (sec): 35.52 - samples/sec: 1202.62 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 16:31:31,313 epoch 10 - iter 534/894 - loss 0.00431619 - time (sec): 42.87 - samples/sec: 1218.53 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 16:31:38,177 epoch 10 - iter 623/894 - loss 0.00428653 - time (sec): 49.73 - samples/sec: 1204.91 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 16:31:45,304 epoch 10 - iter 712/894 - loss 0.00426761 - time (sec): 56.86 - samples/sec: 1208.58 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 16:31:52,251 epoch 10 - iter 801/894 - loss 0.00417570 - time (sec): 63.81 - samples/sec: 1205.29 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 16:31:59,363 epoch 10 - iter 890/894 - loss 0.00387118 - time (sec): 70.92 - samples/sec: 1214.02 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 16:31:59,677 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:31:59,677 EPOCH 10 done: loss 0.0039 - lr: 0.000000 |
|
2023-10-17 16:32:11,414 DEV : loss 0.24554787576198578 - f1-score (micro avg) 0.7947 |
|
2023-10-17 16:32:11,470 saving best model |
|
2023-10-17 16:32:13,374 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 16:32:13,376 Loading model from best epoch ... |
|
2023-10-17 16:32:15,657 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-17 16:32:21,664 |
|
Results: |
|
- F-score (micro) 0.7665 |
|
- F-score (macro) 0.6658 |
|
- Accuracy 0.6401 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.8478    0.8691    0.8583       596 |
|
        pers     0.7116    0.8078    0.7567       333 |
|
         org     0.5207    0.4773    0.4980       132 |
|
        prod     0.5849    0.4697    0.5210        66 |
|
        time     0.7174    0.6735    0.6947        49 |
|
|
|
   micro avg     0.7560    0.7772    0.7665      1176 |
|
   macro avg     0.6765    0.6595    0.6658      1176 |
|
weighted avg     0.7523    0.7772    0.7634      1176 |
|
|
|
2023-10-17 16:32:21,665 ---------------------------------------------------------------------------------------------------- |
|
|