2023-10-17 11:57:40,102 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,104 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 11:57:40,105 ----------------------------------------------------------------------------------------------------
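The printout above is Flair's standard transformer-based SequenceTagger: a 768-dimensional ELECTRA-style encoder feeding LockedDropout and a single linear projection onto 13 BIOES tags, with no CRF. A minimal construction sketch with the Flair API follows; the backbone name is inferred from the training base path further down and the tag set from the prediction dictionary at the end of this log, so treat both as illustrative assumptions rather than values read from code.

    from flair.data import Dictionary
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger

    # The 13 tags listed later in this log ("SequenceTagger predicts: ...").
    label_dict = Dictionary(add_unk=False)
    for tag in ["O",
                "S-LOC", "B-LOC", "E-LOC", "I-LOC",
                "S-BUILDING", "B-BUILDING", "E-BUILDING", "I-BUILDING",
                "S-STREET", "B-STREET", "E-STREET", "I-STREET"]:
        label_dict.add_item(tag)

    # Backbone name inferred from the base path below ("en-hmteams/teams-base-...");
    # the log itself only shows an ElectraModel with vocab 32001 and hidden size 768.
    embeddings = TransformerWordEmbeddings(
        model="hmteams/teams-base-historic-multilingual-discriminator",
        layers="-1",               # "layers-1" in the base path
        subtoken_pooling="first",  # "poolingfirst" in the base path
        fine_tune=True,
    )

    tagger = SequenceTagger(
        hidden_size=256,             # unused when use_rnn=False; kept for the signature
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,               # "crfFalse"; plain CrossEntropyLoss as printed above
        use_rnn=False,
        reproject_embeddings=False,  # the final Linear maps 768 -> 13 tags directly
    )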
2023-10-17 11:57:40,105 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 11:57:40,105 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,105 Train: 6183 sentences
2023-10-17 11:57:40,105 (train_with_dev=False, train_with_test=False)
2023-10-17 11:57:40,105 ----------------------------------------------------------------------------------------------------
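The corpus summary above corresponds to the English "topres19th" configuration of the HIPE-2022 shared-task data, cached by Flair under the path shown. A loading sketch is given below; the keyword arguments are assumptions inferred from the cached path (".../topres19th/en/with_doc_seperator") and may differ between Flair versions.

    from flair.datasets import NER_HIPE_2022

    # Should yield the split logged above: 6183 train / 680 dev / 2113 test sentences.
    corpus = NER_HIPE_2022(
        dataset_name="topres19th",
        language="en",
        add_document_separator=True,  # assumed; matches the "with_doc_seperator" cache dir
    )
    print(corpus)

    # Label dictionary built from the training data (the 13 BIOES tags above).
    label_dict = corpus.make_label_dictionary(label_type="ner")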
2023-10-17 11:57:40,105 Training Params:
2023-10-17 11:57:40,105 - learning_rate: "3e-05"
2023-10-17 11:57:40,105 - mini_batch_size: "8"
2023-10-17 11:57:40,105 - max_epochs: "10"
2023-10-17 11:57:40,105 - shuffle: "True"
2023-10-17 11:57:40,105 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,105 Plugins:
2023-10-17 11:57:40,106 - TensorboardLogger
2023-10-17 11:57:40,106 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 11:57:40,106 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,106 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 11:57:40,106 - metric: "('micro avg', 'f1-score')"
2023-10-17 11:57:40,106 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,106 Computation:
2023-10-17 11:57:40,106 - compute on device: cuda:0
2023-10-17 11:57:40,106 - embedding storage: none
2023-10-17 11:57:40,106 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,106 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 11:57:40,106 ----------------------------------------------------------------------------------------------------
2023-10-17 11:57:40,106 ----------------------------------------------------------------------------------------------------
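The configuration block above (learning rate 3e-05, mini-batch size 8, 10 epochs, shuffling, a linear LR schedule with warmup fraction 0.1, model selection on micro F1 over the dev set) maps onto Flair's fine-tuning entry point roughly as sketched below. "tagger" and "corpus" refer to the earlier sketches; the base path is the one logged above, and the TensorBoard plugin is omitted here.

    from flair.trainers import ModelTrainer

    trainer = ModelTrainer(tagger, corpus)

    # fine_tune() uses a linear schedule with warmup by default, consistent with the
    # "LinearScheduler | warmup_fraction: '0.1'" plugin line in this log.
    trainer.fine_tune(
        "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator"
        "-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
        learning_rate=3e-05,
        mini_batch_size=8,
        max_epochs=10,
    )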
2023-10-17 11:57:40,106 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 11:57:47,156 epoch 1 - iter 77/773 - loss 2.62934140 - time (sec): 7.05 - samples/sec: 1746.12 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:57:54,316 epoch 1 - iter 154/773 - loss 1.60891642 - time (sec): 14.21 - samples/sec: 1734.16 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:58:01,542 epoch 1 - iter 231/773 - loss 1.12039514 - time (sec): 21.43 - samples/sec: 1746.74 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:58:08,974 epoch 1 - iter 308/773 - loss 0.86475702 - time (sec): 28.87 - samples/sec: 1736.86 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:58:16,969 epoch 1 - iter 385/773 - loss 0.71122671 - time (sec): 36.86 - samples/sec: 1696.64 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:58:24,770 epoch 1 - iter 462/773 - loss 0.62105387 - time (sec): 44.66 - samples/sec: 1659.62 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:58:31,943 epoch 1 - iter 539/773 - loss 0.55330225 - time (sec): 51.83 - samples/sec: 1650.46 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:58:38,992 epoch 1 - iter 616/773 - loss 0.49625437 - time (sec): 58.88 - samples/sec: 1666.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:58:47,296 epoch 1 - iter 693/773 - loss 0.44972124 - time (sec): 67.19 - samples/sec: 1655.14 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:58:54,547 epoch 1 - iter 770/773 - loss 0.41433052 - time (sec): 74.44 - samples/sec: 1661.92 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:58:54,816 ----------------------------------------------------------------------------------------------------
2023-10-17 11:58:54,816 EPOCH 1 done: loss 0.4127 - lr: 0.000030
2023-10-17 11:58:57,267 DEV : loss 0.06118296831846237 - f1-score (micro avg) 0.758
2023-10-17 11:58:57,297 saving best model
2023-10-17 11:58:57,894 ----------------------------------------------------------------------------------------------------
2023-10-17 11:59:05,101 epoch 2 - iter 77/773 - loss 0.08295422 - time (sec): 7.20 - samples/sec: 1645.87 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:59:13,021 epoch 2 - iter 154/773 - loss 0.07699183 - time (sec): 15.12 - samples/sec: 1586.84 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:59:20,476 epoch 2 - iter 231/773 - loss 0.08084633 - time (sec): 22.58 - samples/sec: 1613.56 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:59:27,517 epoch 2 - iter 308/773 - loss 0.08317390 - time (sec): 29.62 - samples/sec: 1670.58 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:59:34,890 epoch 2 - iter 385/773 - loss 0.07884677 - time (sec): 36.99 - samples/sec: 1662.50 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:59:42,308 epoch 2 - iter 462/773 - loss 0.07771188 - time (sec): 44.41 - samples/sec: 1677.87 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:59:49,183 epoch 2 - iter 539/773 - loss 0.07602078 - time (sec): 51.29 - samples/sec: 1687.25 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:59:56,592 epoch 2 - iter 616/773 - loss 0.07570427 - time (sec): 58.70 - samples/sec: 1671.75 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:00:04,319 epoch 2 - iter 693/773 - loss 0.07508718 - time (sec): 66.42 - samples/sec: 1686.70 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:00:11,409 epoch 2 - iter 770/773 - loss 0.07433537 - time (sec): 73.51 - samples/sec: 1684.91 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:00:11,700 ----------------------------------------------------------------------------------------------------
2023-10-17 12:00:11,701 EPOCH 2 done: loss 0.0749 - lr: 0.000027
2023-10-17 12:00:14,996 DEV : loss 0.05837954208254814 - f1-score (micro avg) 0.7863
2023-10-17 12:00:15,033 saving best model
2023-10-17 12:00:16,970 ----------------------------------------------------------------------------------------------------
2023-10-17 12:00:24,464 epoch 3 - iter 77/773 - loss 0.04678876 - time (sec): 7.49 - samples/sec: 1585.33 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:00:31,494 epoch 3 - iter 154/773 - loss 0.04394092 - time (sec): 14.52 - samples/sec: 1587.83 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:00:38,728 epoch 3 - iter 231/773 - loss 0.04633664 - time (sec): 21.75 - samples/sec: 1633.15 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:00:46,112 epoch 3 - iter 308/773 - loss 0.05065715 - time (sec): 29.14 - samples/sec: 1663.43 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:00:53,391 epoch 3 - iter 385/773 - loss 0.04995056 - time (sec): 36.42 - samples/sec: 1682.30 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:01:00,605 epoch 3 - iter 462/773 - loss 0.04932353 - time (sec): 43.63 - samples/sec: 1689.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:01:08,272 epoch 3 - iter 539/773 - loss 0.04798072 - time (sec): 51.30 - samples/sec: 1679.29 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:01:16,193 epoch 3 - iter 616/773 - loss 0.04702234 - time (sec): 59.22 - samples/sec: 1664.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:01:23,550 epoch 3 - iter 693/773 - loss 0.04733206 - time (sec): 66.58 - samples/sec: 1670.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:01:30,536 epoch 3 - iter 770/773 - loss 0.04801480 - time (sec): 73.56 - samples/sec: 1684.20 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:01:30,791 ----------------------------------------------------------------------------------------------------
2023-10-17 12:01:30,791 EPOCH 3 done: loss 0.0480 - lr: 0.000023
2023-10-17 12:01:33,682 DEV : loss 0.06652045249938965 - f1-score (micro avg) 0.7692
2023-10-17 12:01:33,712 ----------------------------------------------------------------------------------------------------
2023-10-17 12:01:40,804 epoch 4 - iter 77/773 - loss 0.02865288 - time (sec): 7.09 - samples/sec: 1830.33 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:01:47,995 epoch 4 - iter 154/773 - loss 0.02826719 - time (sec): 14.28 - samples/sec: 1846.02 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:01:55,594 epoch 4 - iter 231/773 - loss 0.03000062 - time (sec): 21.88 - samples/sec: 1747.28 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:02:03,346 epoch 4 - iter 308/773 - loss 0.02857205 - time (sec): 29.63 - samples/sec: 1702.07 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:02:10,474 epoch 4 - iter 385/773 - loss 0.03006392 - time (sec): 36.76 - samples/sec: 1704.74 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:02:17,821 epoch 4 - iter 462/773 - loss 0.03064035 - time (sec): 44.11 - samples/sec: 1698.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:02:25,847 epoch 4 - iter 539/773 - loss 0.03032994 - time (sec): 52.13 - samples/sec: 1662.31 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:02:33,216 epoch 4 - iter 616/773 - loss 0.02951093 - time (sec): 59.50 - samples/sec: 1652.69 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:02:40,366 epoch 4 - iter 693/773 - loss 0.03013193 - time (sec): 66.65 - samples/sec: 1671.24 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:02:47,380 epoch 4 - iter 770/773 - loss 0.03053470 - time (sec): 73.67 - samples/sec: 1679.68 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:02:47,663 ----------------------------------------------------------------------------------------------------
2023-10-17 12:02:47,664 EPOCH 4 done: loss 0.0304 - lr: 0.000020
2023-10-17 12:02:50,759 DEV : loss 0.07929900288581848 - f1-score (micro avg) 0.7817
2023-10-17 12:02:50,793 ----------------------------------------------------------------------------------------------------
2023-10-17 12:02:57,890 epoch 5 - iter 77/773 - loss 0.02217696 - time (sec): 7.10 - samples/sec: 1726.68 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:03:05,723 epoch 5 - iter 154/773 - loss 0.01690662 - time (sec): 14.93 - samples/sec: 1715.44 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:03:12,550 epoch 5 - iter 231/773 - loss 0.02055394 - time (sec): 21.75 - samples/sec: 1703.10 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:03:19,440 epoch 5 - iter 308/773 - loss 0.02201265 - time (sec): 28.65 - samples/sec: 1705.46 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:03:26,685 epoch 5 - iter 385/773 - loss 0.02194969 - time (sec): 35.89 - samples/sec: 1699.08 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:03:34,360 epoch 5 - iter 462/773 - loss 0.02236795 - time (sec): 43.57 - samples/sec: 1683.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:03:42,008 epoch 5 - iter 539/773 - loss 0.02204855 - time (sec): 51.21 - samples/sec: 1682.44 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:03:49,369 epoch 5 - iter 616/773 - loss 0.02258293 - time (sec): 58.57 - samples/sec: 1688.36 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:03:56,358 epoch 5 - iter 693/773 - loss 0.02279416 - time (sec): 65.56 - samples/sec: 1697.30 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:04:03,656 epoch 5 - iter 770/773 - loss 0.02288678 - time (sec): 72.86 - samples/sec: 1695.26 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:04:03,956 ----------------------------------------------------------------------------------------------------
2023-10-17 12:04:03,957 EPOCH 5 done: loss 0.0228 - lr: 0.000017
2023-10-17 12:04:06,846 DEV : loss 0.09690136462450027 - f1-score (micro avg) 0.8083
2023-10-17 12:04:06,878 saving best model
2023-10-17 12:04:08,284 ----------------------------------------------------------------------------------------------------
2023-10-17 12:04:15,535 epoch 6 - iter 77/773 - loss 0.01574627 - time (sec): 7.25 - samples/sec: 1634.93 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:04:22,680 epoch 6 - iter 154/773 - loss 0.01503167 - time (sec): 14.39 - samples/sec: 1606.59 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:04:30,417 epoch 6 - iter 231/773 - loss 0.01525419 - time (sec): 22.13 - samples/sec: 1619.30 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:04:37,501 epoch 6 - iter 308/773 - loss 0.01604884 - time (sec): 29.21 - samples/sec: 1681.45 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:04:44,519 epoch 6 - iter 385/773 - loss 0.01647546 - time (sec): 36.23 - samples/sec: 1699.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:04:51,629 epoch 6 - iter 462/773 - loss 0.01649118 - time (sec): 43.34 - samples/sec: 1703.29 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:04:58,910 epoch 6 - iter 539/773 - loss 0.01640551 - time (sec): 50.62 - samples/sec: 1708.22 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:05:06,068 epoch 6 - iter 616/773 - loss 0.01616345 - time (sec): 57.78 - samples/sec: 1709.61 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:05:12,936 epoch 6 - iter 693/773 - loss 0.01661672 - time (sec): 64.65 - samples/sec: 1723.18 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:05:19,659 epoch 6 - iter 770/773 - loss 0.01623904 - time (sec): 71.37 - samples/sec: 1733.95 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:05:19,954 ----------------------------------------------------------------------------------------------------
2023-10-17 12:05:19,954 EPOCH 6 done: loss 0.0162 - lr: 0.000013
2023-10-17 12:05:22,876 DEV : loss 0.11253025382757187 - f1-score (micro avg) 0.7865
2023-10-17 12:05:22,905 ----------------------------------------------------------------------------------------------------
2023-10-17 12:05:29,670 epoch 7 - iter 77/773 - loss 0.01369959 - time (sec): 6.76 - samples/sec: 1740.54 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:05:36,767 epoch 7 - iter 154/773 - loss 0.01009396 - time (sec): 13.86 - samples/sec: 1732.16 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:05:43,890 epoch 7 - iter 231/773 - loss 0.00886952 - time (sec): 20.98 - samples/sec: 1749.26 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:05:50,813 epoch 7 - iter 308/773 - loss 0.00877763 - time (sec): 27.91 - samples/sec: 1739.80 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:05:57,700 epoch 7 - iter 385/773 - loss 0.00929459 - time (sec): 34.79 - samples/sec: 1733.43 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:06:04,692 epoch 7 - iter 462/773 - loss 0.00883992 - time (sec): 41.79 - samples/sec: 1743.78 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:06:12,154 epoch 7 - iter 539/773 - loss 0.00966097 - time (sec): 49.25 - samples/sec: 1769.92 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:06:19,005 epoch 7 - iter 616/773 - loss 0.00980374 - time (sec): 56.10 - samples/sec: 1767.70 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:06:25,948 epoch 7 - iter 693/773 - loss 0.01034364 - time (sec): 63.04 - samples/sec: 1764.13 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:06:33,616 epoch 7 - iter 770/773 - loss 0.01133078 - time (sec): 70.71 - samples/sec: 1747.32 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:06:33,975 ----------------------------------------------------------------------------------------------------
2023-10-17 12:06:33,975 EPOCH 7 done: loss 0.0114 - lr: 0.000010
2023-10-17 12:06:37,067 DEV : loss 0.11950261145830154 - f1-score (micro avg) 0.795
2023-10-17 12:06:37,096 ----------------------------------------------------------------------------------------------------
2023-10-17 12:06:44,893 epoch 8 - iter 77/773 - loss 0.00625140 - time (sec): 7.79 - samples/sec: 1497.23 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:06:51,920 epoch 8 - iter 154/773 - loss 0.00852418 - time (sec): 14.82 - samples/sec: 1633.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:06:58,955 epoch 8 - iter 231/773 - loss 0.00806622 - time (sec): 21.86 - samples/sec: 1629.72 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:07:05,728 epoch 8 - iter 308/773 - loss 0.00853359 - time (sec): 28.63 - samples/sec: 1652.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:07:12,918 epoch 8 - iter 385/773 - loss 0.00786367 - time (sec): 35.82 - samples/sec: 1688.05 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:07:20,139 epoch 8 - iter 462/773 - loss 0.00702458 - time (sec): 43.04 - samples/sec: 1715.38 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:07:27,147 epoch 8 - iter 539/773 - loss 0.00704735 - time (sec): 50.05 - samples/sec: 1722.89 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:07:34,445 epoch 8 - iter 616/773 - loss 0.00717121 - time (sec): 57.35 - samples/sec: 1727.76 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:07:41,580 epoch 8 - iter 693/773 - loss 0.00713439 - time (sec): 64.48 - samples/sec: 1724.76 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:07:49,000 epoch 8 - iter 770/773 - loss 0.00703776 - time (sec): 71.90 - samples/sec: 1721.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:07:49,276 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:49,276 EPOCH 8 done: loss 0.0070 - lr: 0.000007
2023-10-17 12:07:52,369 DEV : loss 0.12206191569566727 - f1-score (micro avg) 0.7942
2023-10-17 12:07:52,400 ----------------------------------------------------------------------------------------------------
2023-10-17 12:07:59,464 epoch 9 - iter 77/773 - loss 0.00424893 - time (sec): 7.06 - samples/sec: 1783.62 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:08:06,976 epoch 9 - iter 154/773 - loss 0.00355699 - time (sec): 14.57 - samples/sec: 1795.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:08:13,824 epoch 9 - iter 231/773 - loss 0.00385022 - time (sec): 21.42 - samples/sec: 1774.45 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:08:20,704 epoch 9 - iter 308/773 - loss 0.00363732 - time (sec): 28.30 - samples/sec: 1791.32 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:08:27,824 epoch 9 - iter 385/773 - loss 0.00344109 - time (sec): 35.42 - samples/sec: 1778.24 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:08:34,742 epoch 9 - iter 462/773 - loss 0.00372679 - time (sec): 42.34 - samples/sec: 1761.91 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:08:41,980 epoch 9 - iter 539/773 - loss 0.00415038 - time (sec): 49.58 - samples/sec: 1752.63 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:08:49,436 epoch 9 - iter 616/773 - loss 0.00418043 - time (sec): 57.03 - samples/sec: 1746.10 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:08:56,848 epoch 9 - iter 693/773 - loss 0.00417050 - time (sec): 64.45 - samples/sec: 1747.90 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:09:04,074 epoch 9 - iter 770/773 - loss 0.00447844 - time (sec): 71.67 - samples/sec: 1726.86 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:09:04,352 ----------------------------------------------------------------------------------------------------
2023-10-17 12:09:04,353 EPOCH 9 done: loss 0.0045 - lr: 0.000003
2023-10-17 12:09:07,443 DEV : loss 0.13099931180477142 - f1-score (micro avg) 0.7967
2023-10-17 12:09:07,475 ----------------------------------------------------------------------------------------------------
2023-10-17 12:09:14,596 epoch 10 - iter 77/773 - loss 0.00280890 - time (sec): 7.12 - samples/sec: 1699.23 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:09:22,633 epoch 10 - iter 154/773 - loss 0.00194984 - time (sec): 15.16 - samples/sec: 1701.26 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:09:29,624 epoch 10 - iter 231/773 - loss 0.00196051 - time (sec): 22.15 - samples/sec: 1728.94 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:09:36,424 epoch 10 - iter 308/773 - loss 0.00257133 - time (sec): 28.95 - samples/sec: 1732.49 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:09:43,620 epoch 10 - iter 385/773 - loss 0.00314051 - time (sec): 36.14 - samples/sec: 1741.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:09:50,669 epoch 10 - iter 462/773 - loss 0.00334541 - time (sec): 43.19 - samples/sec: 1755.01 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:09:57,728 epoch 10 - iter 539/773 - loss 0.00324725 - time (sec): 50.25 - samples/sec: 1753.88 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:10:04,717 epoch 10 - iter 616/773 - loss 0.00353126 - time (sec): 57.24 - samples/sec: 1728.44 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:10:11,916 epoch 10 - iter 693/773 - loss 0.00325470 - time (sec): 64.44 - samples/sec: 1736.98 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:10:19,023 epoch 10 - iter 770/773 - loss 0.00330436 - time (sec): 71.55 - samples/sec: 1730.28 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:10:19,309 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:19,310 EPOCH 10 done: loss 0.0033 - lr: 0.000000
2023-10-17 12:10:22,291 DEV : loss 0.1352890431880951 - f1-score (micro avg) 0.7844
2023-10-17 12:10:22,951 ----------------------------------------------------------------------------------------------------
2023-10-17 12:10:22,953 Loading model from best epoch ...
2023-10-17 12:10:25,332 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 12:10:33,825
Results:
- F-score (micro) 0.8057
- F-score (macro) 0.7196
- Accuracy 0.6904
By class:
              precision    recall  f1-score   support

         LOC     0.8495    0.8414    0.8455       946
    BUILDING     0.6795    0.5730    0.6217       185
      STREET     0.7255    0.6607    0.6916        56

   micro avg     0.8208    0.7911    0.8057      1187
   macro avg     0.7515    0.6917    0.7196      1187
weighted avg     0.8172    0.7911    0.8033      1187
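As a sanity check on the summary rows: the micro-average F1 is the harmonic mean of the pooled precision and recall, while the macro-average F1 is the unweighted mean of the three per-class F1 scores. A small verification using only the numbers from the table:

    # Per-class rows copied from the table above: (precision, recall, f1, support)
    per_class = {
        "LOC":      (0.8495, 0.8414, 0.8455, 946),
        "BUILDING": (0.6795, 0.5730, 0.6217, 185),
        "STREET":   (0.7255, 0.6607, 0.6916, 56),
    }

    micro_p, micro_r = 0.8208, 0.7911
    micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
    macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

    print(f"micro F1 ~ {micro_f1:.4f}")  # ~0.8057, the reported F-score (micro)
    print(f"macro F1 ~ {macro_f1:.4f}")  # ~0.7196, the reported F-score (macro)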
2023-10-17 12:10:33,825 ----------------------------------------------------------------------------------------------------
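For inference, the checkpoint saved as best-model.pt under the training base path can be loaded directly. The snippet below is a generic Flair usage sketch; the example sentence is invented for illustration.

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Best checkpoint from the run logged above (epoch 5, dev micro F1 0.8083).
    tagger = SequenceTagger.load(
        "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator"
        "-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
    )

    sentence = Sentence("A new warehouse was built on Commercial Road in London .")
    tagger.predict(sentence)

    # Prints the predicted LOC / BUILDING / STREET spans with confidence scores.
    for span in sentence.get_spans("ner"):
        print(span)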