2023-10-17 11:00:14,315 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,316 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 11:00:14,317 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,317 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 11:00:14,317 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,317 Train: 6183 sentences
2023-10-17 11:00:14,317 (train_with_dev=False, train_with_test=False)
2023-10-17 11:00:14,317 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,317 Training Params:
2023-10-17 11:00:14,317 - learning_rate: "5e-05"
2023-10-17 11:00:14,317 - mini_batch_size: "8"
2023-10-17 11:00:14,317 - max_epochs: "10"
2023-10-17 11:00:14,317 - shuffle: "True"
2023-10-17 11:00:14,317 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,318 Plugins:
2023-10-17 11:00:14,318 - TensorboardLogger
2023-10-17 11:00:14,318 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 11:00:14,318 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,318 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 11:00:14,318 - metric: "('micro avg', 'f1-score')"
2023-10-17 11:00:14,318 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,318 Computation:
2023-10-17 11:00:14,318 - compute on device: cuda:0
2023-10-17 11:00:14,318 - embedding storage: none
2023-10-17 11:00:14,318 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,318 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 11:00:14,318 ----------------------------------------------------------------------------------------------------
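The configuration summarised above (hmTEAMS historic multilingual discriminator embeddings, first-subtoken pooling of the last layer, no CRF, learning rate 5e-05, mini-batch size 8, 10 epochs, linear warmup fraction 0.1) corresponds to Flair's fine-tuning API. The following is a minimal sketch of how a comparable run could be set up, not the original training script; the Hub model identifier and the exact NER_HIPE_2022 constructor arguments are assumptions inferred from the base path and corpus line above.

```python
# Minimal sketch of a comparable Flair fine-tuning run (assumptions noted in comments).
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2022 "topres19th" English split, as listed in the corpus summary above
# (exact constructor arguments are an assumption).
corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Historic multilingual ELECTRA discriminator; the Hub identifier is assumed
# from the "teams-base-historic-multilingual-discriminator" part of the base path.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",                # last layer only ("layers-1" in the base path)
    subtoken_pooling="first",   # "poolingfirst" in the base path
    fine_tune=True,
)

# Linear head over the embeddings, no CRF/RNN ("crfFalse" in the base path),
# projecting to the 13 BIOES tags reported at the end of this log.
tagger = SequenceTagger(
    hidden_size=256,            # unused without an RNN; kept for the constructor
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
)

# fine_tune() applies a linear schedule with warmup by default; the log above
# shows a LinearScheduler plugin with warmup_fraction 0.1.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-topres19th/en-hmteams/...",  # shortened; full base path as logged above
    learning_rate=5e-05,
    mini_batch_size=8,
    max_epochs=10,
)
```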
2023-10-17 11:00:14,318 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,318 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 11:00:21,885 epoch 1 - iter 77/773 - loss 2.45093293 - time (sec): 7.56 - samples/sec: 1493.76 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:00:29,944 epoch 1 - iter 154/773 - loss 1.24837000 - time (sec): 15.62 - samples/sec: 1587.57 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:00:37,607 epoch 1 - iter 231/773 - loss 0.88720878 - time (sec): 23.29 - samples/sec: 1621.94 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:00:45,319 epoch 1 - iter 308/773 - loss 0.69711197 - time (sec): 31.00 - samples/sec: 1630.66 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:00:52,416 epoch 1 - iter 385/773 - loss 0.58170860 - time (sec): 38.10 - samples/sec: 1662.22 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:00:59,813 epoch 1 - iter 462/773 - loss 0.50112244 - time (sec): 45.49 - samples/sec: 1674.06 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:01:07,113 epoch 1 - iter 539/773 - loss 0.44948369 - time (sec): 52.79 - samples/sec: 1657.75 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:01:14,963 epoch 1 - iter 616/773 - loss 0.40797204 - time (sec): 60.64 - samples/sec: 1645.26 - lr: 0.000040 - momentum: 0.000000
2023-10-17 11:01:22,928 epoch 1 - iter 693/773 - loss 0.37438356 - time (sec): 68.61 - samples/sec: 1637.73 - lr: 0.000045 - momentum: 0.000000
2023-10-17 11:01:30,630 epoch 1 - iter 770/773 - loss 0.34972134 - time (sec): 76.31 - samples/sec: 1623.90 - lr: 0.000050 - momentum: 0.000000
2023-10-17 11:01:30,909 ----------------------------------------------------------------------------------------------------
2023-10-17 11:01:30,910 EPOCH 1 done: loss 0.3490 - lr: 0.000050
2023-10-17 11:01:33,275 DEV : loss 0.05666542798280716 - f1-score (micro avg) 0.7588
2023-10-17 11:01:33,307 saving best model
2023-10-17 11:01:33,850 ----------------------------------------------------------------------------------------------------
2023-10-17 11:01:41,159 epoch 2 - iter 77/773 - loss 0.10595151 - time (sec): 7.31 - samples/sec: 1665.95 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:01:48,833 epoch 2 - iter 154/773 - loss 0.09841306 - time (sec): 14.98 - samples/sec: 1681.95 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:01:56,463 epoch 2 - iter 231/773 - loss 0.09574344 - time (sec): 22.61 - samples/sec: 1633.46 - lr: 0.000048 - momentum: 0.000000
2023-10-17 11:02:04,412 epoch 2 - iter 308/773 - loss 0.09169926 - time (sec): 30.56 - samples/sec: 1613.33 - lr: 0.000048 - momentum: 0.000000
2023-10-17 11:02:12,560 epoch 2 - iter 385/773 - loss 0.08863266 - time (sec): 38.71 - samples/sec: 1583.60 - lr: 0.000047 - momentum: 0.000000
2023-10-17 11:02:20,339 epoch 2 - iter 462/773 - loss 0.08698644 - time (sec): 46.49 - samples/sec: 1577.22 - lr: 0.000047 - momentum: 0.000000
2023-10-17 11:02:28,249 epoch 2 - iter 539/773 - loss 0.08349280 - time (sec): 54.40 - samples/sec: 1581.51 - lr: 0.000046 - momentum: 0.000000
2023-10-17 11:02:35,216 epoch 2 - iter 616/773 - loss 0.08431562 - time (sec): 61.36 - samples/sec: 1604.20 - lr: 0.000046 - momentum: 0.000000
2023-10-17 11:02:42,366 epoch 2 - iter 693/773 - loss 0.08313786 - time (sec): 68.51 - samples/sec: 1621.52 - lr: 0.000045 - momentum: 0.000000
2023-10-17 11:02:50,101 epoch 2 - iter 770/773 - loss 0.08075274 - time (sec): 76.25 - samples/sec: 1625.77 - lr: 0.000044 - momentum: 0.000000
2023-10-17 11:02:50,377 ----------------------------------------------------------------------------------------------------
2023-10-17 11:02:50,377 EPOCH 2 done: loss 0.0807 - lr: 0.000044
2023-10-17 11:02:54,039 DEV : loss 0.05350477248430252 - f1-score (micro avg) 0.7425
2023-10-17 11:02:54,072 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:01,603 epoch 3 - iter 77/773 - loss 0.05640998 - time (sec): 7.53 - samples/sec: 1560.20 - lr: 0.000044 - momentum: 0.000000
2023-10-17 11:03:09,509 epoch 3 - iter 154/773 - loss 0.05629241 - time (sec): 15.43 - samples/sec: 1659.98 - lr: 0.000043 - momentum: 0.000000
2023-10-17 11:03:17,025 epoch 3 - iter 231/773 - loss 0.05247215 - time (sec): 22.95 - samples/sec: 1689.23 - lr: 0.000043 - momentum: 0.000000
2023-10-17 11:03:23,846 epoch 3 - iter 308/773 - loss 0.05112877 - time (sec): 29.77 - samples/sec: 1690.16 - lr: 0.000042 - momentum: 0.000000
2023-10-17 11:03:30,759 epoch 3 - iter 385/773 - loss 0.05111415 - time (sec): 36.68 - samples/sec: 1697.84 - lr: 0.000042 - momentum: 0.000000
2023-10-17 11:03:37,744 epoch 3 - iter 462/773 - loss 0.05066311 - time (sec): 43.67 - samples/sec: 1718.06 - lr: 0.000041 - momentum: 0.000000
2023-10-17 11:03:44,762 epoch 3 - iter 539/773 - loss 0.04970420 - time (sec): 50.69 - samples/sec: 1718.08 - lr: 0.000041 - momentum: 0.000000
2023-10-17 11:03:52,179 epoch 3 - iter 616/773 - loss 0.05056488 - time (sec): 58.10 - samples/sec: 1711.20 - lr: 0.000040 - momentum: 0.000000
2023-10-17 11:04:00,191 epoch 3 - iter 693/773 - loss 0.04984793 - time (sec): 66.12 - samples/sec: 1698.05 - lr: 0.000039 - momentum: 0.000000
2023-10-17 11:04:07,775 epoch 3 - iter 770/773 - loss 0.04912347 - time (sec): 73.70 - samples/sec: 1681.26 - lr: 0.000039 - momentum: 0.000000
2023-10-17 11:04:08,068 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:08,069 EPOCH 3 done: loss 0.0490 - lr: 0.000039
2023-10-17 11:04:11,130 DEV : loss 0.07729422301054001 - f1-score (micro avg) 0.7535
2023-10-17 11:04:11,162 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:18,431 epoch 4 - iter 77/773 - loss 0.04152581 - time (sec): 7.27 - samples/sec: 1784.91 - lr: 0.000038 - momentum: 0.000000
2023-10-17 11:04:25,836 epoch 4 - iter 154/773 - loss 0.04153601 - time (sec): 14.67 - samples/sec: 1725.78 - lr: 0.000038 - momentum: 0.000000
2023-10-17 11:04:33,460 epoch 4 - iter 231/773 - loss 0.03745722 - time (sec): 22.30 - samples/sec: 1670.11 - lr: 0.000037 - momentum: 0.000000
2023-10-17 11:04:40,594 epoch 4 - iter 308/773 - loss 0.03833193 - time (sec): 29.43 - samples/sec: 1674.11 - lr: 0.000037 - momentum: 0.000000
2023-10-17 11:04:48,298 epoch 4 - iter 385/773 - loss 0.03751315 - time (sec): 37.13 - samples/sec: 1673.41 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:04:55,464 epoch 4 - iter 462/773 - loss 0.04103089 - time (sec): 44.30 - samples/sec: 1679.69 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:05:02,532 epoch 4 - iter 539/773 - loss 0.04171505 - time (sec): 51.37 - samples/sec: 1686.35 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:05:09,597 epoch 4 - iter 616/773 - loss 0.04053471 - time (sec): 58.43 - samples/sec: 1704.07 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:05:16,908 epoch 4 - iter 693/773 - loss 0.04085799 - time (sec): 65.74 - samples/sec: 1703.30 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:05:23,845 epoch 4 - iter 770/773 - loss 0.03858981 - time (sec): 72.68 - samples/sec: 1703.53 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:05:24,115 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:24,115 EPOCH 4 done: loss 0.0386 - lr: 0.000033
2023-10-17 11:05:27,335 DEV : loss 0.10291995108127594 - f1-score (micro avg) 0.75
2023-10-17 11:05:27,368 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:34,468 epoch 5 - iter 77/773 - loss 0.02121432 - time (sec): 7.10 - samples/sec: 1722.34 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:05:41,809 epoch 5 - iter 154/773 - loss 0.02173581 - time (sec): 14.44 - samples/sec: 1656.77 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:05:49,386 epoch 5 - iter 231/773 - loss 0.02284222 - time (sec): 22.02 - samples/sec: 1623.47 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:05:57,449 epoch 5 - iter 308/773 - loss 0.02224722 - time (sec): 30.08 - samples/sec: 1599.93 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:06:05,538 epoch 5 - iter 385/773 - loss 0.02315292 - time (sec): 38.17 - samples/sec: 1584.64 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:06:13,313 epoch 5 - iter 462/773 - loss 0.02280264 - time (sec): 45.94 - samples/sec: 1586.27 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:06:21,346 epoch 5 - iter 539/773 - loss 0.02222341 - time (sec): 53.98 - samples/sec: 1605.57 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:06:29,147 epoch 5 - iter 616/773 - loss 0.02225636 - time (sec): 61.78 - samples/sec: 1604.81 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:06:36,844 epoch 5 - iter 693/773 - loss 0.02247422 - time (sec): 69.47 - samples/sec: 1598.18 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:06:44,690 epoch 5 - iter 770/773 - loss 0.02385905 - time (sec): 77.32 - samples/sec: 1601.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:06:44,982 ----------------------------------------------------------------------------------------------------
2023-10-17 11:06:44,983 EPOCH 5 done: loss 0.0239 - lr: 0.000028
2023-10-17 11:06:47,973 DEV : loss 0.10200614482164383 - f1-score (micro avg) 0.7755
2023-10-17 11:06:48,006 saving best model
2023-10-17 11:06:49,423 ----------------------------------------------------------------------------------------------------
2023-10-17 11:06:57,017 epoch 6 - iter 77/773 - loss 0.02265088 - time (sec): 7.59 - samples/sec: 1636.48 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:07:04,370 epoch 6 - iter 154/773 - loss 0.01754504 - time (sec): 14.94 - samples/sec: 1670.67 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:07:11,949 epoch 6 - iter 231/773 - loss 0.02091300 - time (sec): 22.52 - samples/sec: 1669.32 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:07:19,655 epoch 6 - iter 308/773 - loss 0.02004128 - time (sec): 30.23 - samples/sec: 1659.22 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:07:27,169 epoch 6 - iter 385/773 - loss 0.01990145 - time (sec): 37.74 - samples/sec: 1647.46 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:07:34,917 epoch 6 - iter 462/773 - loss 0.01815685 - time (sec): 45.49 - samples/sec: 1638.47 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:07:42,565 epoch 6 - iter 539/773 - loss 0.01761772 - time (sec): 53.14 - samples/sec: 1653.33 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:07:50,486 epoch 6 - iter 616/773 - loss 0.01780920 - time (sec): 61.06 - samples/sec: 1629.64 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:07:58,505 epoch 6 - iter 693/773 - loss 0.01725283 - time (sec): 69.08 - samples/sec: 1615.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:08:06,220 epoch 6 - iter 770/773 - loss 0.01731812 - time (sec): 76.79 - samples/sec: 1612.45 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:08:06,512 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:06,513 EPOCH 6 done: loss 0.0175 - lr: 0.000022
2023-10-17 11:08:09,372 DEV : loss 0.1064695492386818 - f1-score (micro avg) 0.7812
2023-10-17 11:08:09,402 saving best model
2023-10-17 11:08:10,827 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:19,013 epoch 7 - iter 77/773 - loss 0.00823321 - time (sec): 8.18 - samples/sec: 1534.57 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:08:27,207 epoch 7 - iter 154/773 - loss 0.00771650 - time (sec): 16.38 - samples/sec: 1596.41 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:08:34,794 epoch 7 - iter 231/773 - loss 0.00743188 - time (sec): 23.96 - samples/sec: 1601.35 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:08:42,358 epoch 7 - iter 308/773 - loss 0.00932527 - time (sec): 31.53 - samples/sec: 1601.51 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:08:49,801 epoch 7 - iter 385/773 - loss 0.01044298 - time (sec): 38.97 - samples/sec: 1607.78 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:08:57,085 epoch 7 - iter 462/773 - loss 0.01033137 - time (sec): 46.25 - samples/sec: 1603.80 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:09:04,079 epoch 7 - iter 539/773 - loss 0.01077934 - time (sec): 53.25 - samples/sec: 1616.94 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:09:10,970 epoch 7 - iter 616/773 - loss 0.01030466 - time (sec): 60.14 - samples/sec: 1644.61 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:09:18,516 epoch 7 - iter 693/773 - loss 0.01069559 - time (sec): 67.68 - samples/sec: 1651.42 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:09:26,294 epoch 7 - iter 770/773 - loss 0.01086256 - time (sec): 75.46 - samples/sec: 1642.31 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:09:26,580 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:26,580 EPOCH 7 done: loss 0.0109 - lr: 0.000017
2023-10-17 11:09:29,687 DEV : loss 0.1156071275472641 - f1-score (micro avg) 0.795
2023-10-17 11:09:29,716 saving best model
2023-10-17 11:09:30,324 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:37,985 epoch 8 - iter 77/773 - loss 0.00730225 - time (sec): 7.66 - samples/sec: 1614.48 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:09:45,493 epoch 8 - iter 154/773 - loss 0.00623081 - time (sec): 15.17 - samples/sec: 1667.90 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:09:52,733 epoch 8 - iter 231/773 - loss 0.00556449 - time (sec): 22.41 - samples/sec: 1656.66 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:10:00,021 epoch 8 - iter 308/773 - loss 0.00601470 - time (sec): 29.69 - samples/sec: 1667.09 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:10:07,775 epoch 8 - iter 385/773 - loss 0.00628428 - time (sec): 37.45 - samples/sec: 1664.34 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:10:14,899 epoch 8 - iter 462/773 - loss 0.00605850 - time (sec): 44.57 - samples/sec: 1683.47 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:10:22,424 epoch 8 - iter 539/773 - loss 0.00700050 - time (sec): 52.10 - samples/sec: 1672.76 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:10:29,379 epoch 8 - iter 616/773 - loss 0.00759703 - time (sec): 59.05 - samples/sec: 1679.44 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:10:36,671 epoch 8 - iter 693/773 - loss 0.00788866 - time (sec): 66.35 - samples/sec: 1683.27 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:10:43,865 epoch 8 - iter 770/773 - loss 0.00753285 - time (sec): 73.54 - samples/sec: 1682.46 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:10:44,140 ----------------------------------------------------------------------------------------------------
2023-10-17 11:10:44,141 EPOCH 8 done: loss 0.0076 - lr: 0.000011
2023-10-17 11:10:47,005 DEV : loss 0.12029711902141571 - f1-score (micro avg) 0.7907
2023-10-17 11:10:47,036 ----------------------------------------------------------------------------------------------------
2023-10-17 11:10:55,094 epoch 9 - iter 77/773 - loss 0.00421926 - time (sec): 8.06 - samples/sec: 1520.59 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:11:02,690 epoch 9 - iter 154/773 - loss 0.00644406 - time (sec): 15.65 - samples/sec: 1544.60 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:11:10,046 epoch 9 - iter 231/773 - loss 0.00582051 - time (sec): 23.01 - samples/sec: 1641.13 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:11:17,664 epoch 9 - iter 308/773 - loss 0.00577635 - time (sec): 30.63 - samples/sec: 1616.34 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:11:24,755 epoch 9 - iter 385/773 - loss 0.00505832 - time (sec): 37.72 - samples/sec: 1630.49 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:11:31,696 epoch 9 - iter 462/773 - loss 0.00461600 - time (sec): 44.66 - samples/sec: 1643.56 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:11:38,576 epoch 9 - iter 539/773 - loss 0.00451370 - time (sec): 51.54 - samples/sec: 1659.27 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:11:45,934 epoch 9 - iter 616/773 - loss 0.00431455 - time (sec): 58.90 - samples/sec: 1685.41 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:11:53,167 epoch 9 - iter 693/773 - loss 0.00441306 - time (sec): 66.13 - samples/sec: 1680.31 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:12:00,970 epoch 9 - iter 770/773 - loss 0.00435539 - time (sec): 73.93 - samples/sec: 1677.04 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:12:01,228 ----------------------------------------------------------------------------------------------------
2023-10-17 11:12:01,228 EPOCH 9 done: loss 0.0044 - lr: 0.000006
2023-10-17 11:12:04,174 DEV : loss 0.12349887937307358 - f1-score (micro avg) 0.7844
2023-10-17 11:12:04,207 ----------------------------------------------------------------------------------------------------
2023-10-17 11:12:11,747 epoch 10 - iter 77/773 - loss 0.00455782 - time (sec): 7.54 - samples/sec: 1732.92 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:12:19,108 epoch 10 - iter 154/773 - loss 0.00420301 - time (sec): 14.90 - samples/sec: 1648.39 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:12:26,608 epoch 10 - iter 231/773 - loss 0.00367766 - time (sec): 22.40 - samples/sec: 1622.02 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:12:33,675 epoch 10 - iter 308/773 - loss 0.00347476 - time (sec): 29.47 - samples/sec: 1663.99 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:12:41,045 epoch 10 - iter 385/773 - loss 0.00326384 - time (sec): 36.84 - samples/sec: 1660.68 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:12:49,048 epoch 10 - iter 462/773 - loss 0.00312465 - time (sec): 44.84 - samples/sec: 1633.05 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:12:57,102 epoch 10 - iter 539/773 - loss 0.00324546 - time (sec): 52.89 - samples/sec: 1628.44 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:13:05,165 epoch 10 - iter 616/773 - loss 0.00279880 - time (sec): 60.96 - samples/sec: 1644.46 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:13:12,609 epoch 10 - iter 693/773 - loss 0.00267431 - time (sec): 68.40 - samples/sec: 1634.31 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:13:20,306 epoch 10 - iter 770/773 - loss 0.00286368 - time (sec): 76.10 - samples/sec: 1628.24 - lr: 0.000000 - momentum: 0.000000
2023-10-17 11:13:20,580 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:20,581 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-17 11:13:23,848 DEV : loss 0.12833575904369354 - f1-score (micro avg) 0.791
2023-10-17 11:13:24,507 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:24,509 Loading model from best epoch ...
2023-10-17 11:13:27,115 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 11:13:36,470
Results:
- F-score (micro) 0.8191
- F-score (macro) 0.7465
- Accuracy 0.7116
By class:
              precision    recall  f1-score   support

         LOC     0.8524    0.8668    0.8595       946
    BUILDING     0.6506    0.5838    0.6154       185
      STREET     0.8478    0.6964    0.7647        56

   micro avg     0.8237    0.8147    0.8191      1187
   macro avg     0.7836    0.7157    0.7465      1187
weighted avg     0.8207    0.8147    0.8170      1187
2023-10-17 11:13:36,470 ----------------------------------------------------------------------------------------------------
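To apply the trained tagger, the saved checkpoint can be loaded and used for prediction; below is a hedged usage sketch. The local path and the example sentence are illustrative, and loading the corresponding Hugging Face Hub repository via SequenceTagger.load works the same way.

```python
# Usage sketch (illustrative path and sentence): load the best checkpoint and tag text.
from flair.data import Sentence
from flair.models import SequenceTagger

# best-model.pt as saved under the training base path above; a Hub repo id also works.
tagger = SequenceTagger.load("hmbench-topres19th/en-hmteams/.../best-model.pt")

sentence = Sentence("He lived on Regent Street in London .")
tagger.predict(sentence)

# BIOES tags (S-/B-/I-/E- over LOC, BUILDING, STREET) are decoded into spans.
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, round(span.score, 4))
```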