2023-10-17 11:00:14,315 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,316 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 11:00:14,317 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,317 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 11:00:14,317 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,317 Train: 6183 sentences
2023-10-17 11:00:14,317 (train_with_dev=False, train_with_test=False)
2023-10-17 11:00:14,317 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,317 Training Params:
2023-10-17 11:00:14,317 - learning_rate: "5e-05"
2023-10-17 11:00:14,317 - mini_batch_size: "8"
2023-10-17 11:00:14,317 - max_epochs: "10"
2023-10-17 11:00:14,317 - shuffle: "True"
2023-10-17 11:00:14,317 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,318 Plugins:
2023-10-17 11:00:14,318 - TensorboardLogger
2023-10-17 11:00:14,318 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 11:00:14,318 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,318 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 11:00:14,318 - metric: "('micro avg', 'f1-score')"
2023-10-17 11:00:14,318 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,318 Computation:
2023-10-17 11:00:14,318 - compute on device: cuda:0
2023-10-17 11:00:14,318 - embedding storage: none
2023-10-17 11:00:14,318 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,318 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 11:00:14,318 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,318 ----------------------------------------------------------------------------------------------------
2023-10-17 11:00:14,318 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 11:00:21,885 epoch 1 - iter 77/773 - loss 2.45093293 - time (sec): 7.56 - samples/sec: 1493.76 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:00:29,944 epoch 1 - iter 154/773 - loss 1.24837000 - time (sec): 15.62 - samples/sec: 1587.57 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:00:37,607 epoch 1 - iter 231/773 - loss 0.88720878 - time (sec): 23.29 - samples/sec: 1621.94 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:00:45,319 epoch 1 - iter 308/773 - loss 0.69711197 - time (sec): 31.00 - samples/sec: 1630.66 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:00:52,416 epoch 1 - iter 385/773 - loss 0.58170860 - time (sec): 38.10 - samples/sec: 1662.22 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:00:59,813 epoch 1 - iter 462/773 - loss 0.50112244 - time (sec): 45.49 - samples/sec: 1674.06 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:01:07,113 epoch 1 - iter 539/773 - loss 0.44948369 - time (sec): 52.79 - samples/sec: 1657.75 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:01:14,963 epoch 1 - iter 616/773 - loss 0.40797204 - time (sec): 60.64 - samples/sec: 1645.26 - lr: 0.000040 - momentum: 0.000000
2023-10-17 11:01:22,928 epoch 1 - iter 693/773 - loss 0.37438356 - time (sec): 68.61 - samples/sec: 1637.73 - lr: 0.000045 - momentum: 0.000000
2023-10-17 11:01:30,630 epoch 1 - iter 770/773 - loss 0.34972134 - time (sec): 76.31 - samples/sec: 1623.90 - lr: 0.000050 - momentum: 0.000000
2023-10-17 11:01:30,909 ----------------------------------------------------------------------------------------------------
2023-10-17 11:01:30,910 EPOCH 1 done: loss 0.3490 - lr: 0.000050
2023-10-17 11:01:33,275 DEV : loss 0.05666542798280716 - f1-score (micro avg) 0.7588
2023-10-17 11:01:33,307 saving best model
2023-10-17 11:01:33,850 ----------------------------------------------------------------------------------------------------
2023-10-17 11:01:41,159 epoch 2 - iter 77/773 - loss 0.10595151 - time (sec): 7.31 - samples/sec: 1665.95 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:01:48,833 epoch 2 - iter 154/773 - loss 0.09841306 - time (sec): 14.98 - samples/sec: 1681.95 - lr: 0.000049 - momentum: 0.000000
2023-10-17 11:01:56,463 epoch 2 - iter 231/773 - loss 0.09574344 - time (sec): 22.61 - samples/sec: 1633.46 - lr: 0.000048 - momentum: 0.000000
2023-10-17 11:02:04,412 epoch 2 - iter 308/773 - loss 0.09169926 - time (sec): 30.56 - samples/sec: 1613.33 - lr: 0.000048 - momentum: 0.000000
2023-10-17 11:02:12,560 epoch 2 - iter 385/773 - loss 0.08863266 - time (sec): 38.71 - samples/sec: 1583.60 - lr: 0.000047 - momentum: 0.000000
2023-10-17 11:02:20,339 epoch 2 - iter 462/773 - loss 0.08698644 - time (sec): 46.49 - samples/sec: 1577.22 - lr: 0.000047 - momentum: 0.000000
2023-10-17 11:02:28,249 epoch 2 - iter 539/773 - loss 0.08349280 - time (sec): 54.40 - samples/sec: 1581.51 - lr: 0.000046 - momentum: 0.000000
2023-10-17 11:02:35,216 epoch 2 - iter 616/773 - loss 0.08431562 - time (sec): 61.36 - samples/sec: 1604.20 - lr: 0.000046 - momentum: 0.000000
2023-10-17 11:02:42,366 epoch 2 - iter 693/773 - loss 0.08313786 - time (sec): 68.51 - samples/sec: 1621.52 - lr: 0.000045 - momentum: 0.000000
2023-10-17 11:02:50,101 epoch 2 - iter 770/773 - loss 0.08075274 - time (sec): 76.25 - samples/sec: 1625.77 - lr: 0.000044 - momentum: 0.000000
2023-10-17 11:02:50,377 ----------------------------------------------------------------------------------------------------
2023-10-17 11:02:50,377 EPOCH 2 done: loss 0.0807 - lr: 0.000044
2023-10-17 11:02:54,039 DEV : loss 0.05350477248430252 - f1-score (micro avg) 0.7425
2023-10-17 11:02:54,072 ----------------------------------------------------------------------------------------------------
2023-10-17 11:03:01,603 epoch 3 - iter 77/773 - loss 0.05640998 - time (sec): 7.53 - samples/sec: 1560.20 - lr: 0.000044 - momentum: 0.000000
2023-10-17 11:03:09,509 epoch 3 - iter 154/773 - loss 0.05629241 - time (sec): 15.43 - samples/sec: 1659.98 - lr: 0.000043 - momentum: 0.000000
2023-10-17 11:03:17,025 epoch 3 - iter 231/773 - loss 0.05247215 - time (sec): 22.95 - samples/sec: 1689.23 - lr: 0.000043 - momentum: 0.000000
2023-10-17 11:03:23,846 epoch 3 - iter 308/773 - loss 0.05112877 - time (sec): 29.77 - samples/sec: 1690.16 - lr: 0.000042 - momentum: 0.000000
2023-10-17 11:03:30,759 epoch 3 - iter 385/773 - loss 0.05111415 - time (sec): 36.68 - samples/sec: 1697.84 - lr: 0.000042 - momentum: 0.000000
2023-10-17 11:03:37,744 epoch 3 - iter 462/773 - loss 0.05066311 - time (sec): 43.67 - samples/sec: 1718.06 - lr: 0.000041 - momentum: 0.000000
2023-10-17 11:03:44,762 epoch 3 - iter 539/773 - loss 0.04970420 - time (sec): 50.69 - samples/sec: 1718.08 - lr: 0.000041 - momentum: 0.000000
2023-10-17 11:03:52,179 epoch 3 - iter 616/773 - loss 0.05056488 - time (sec): 58.10 - samples/sec: 1711.20 - lr: 0.000040 - momentum: 0.000000
2023-10-17 11:04:00,191 epoch 3 - iter 693/773 - loss 0.04984793 - time (sec): 66.12 - samples/sec: 1698.05 - lr: 0.000039 - momentum: 0.000000
2023-10-17 11:04:07,775 epoch 3 - iter 770/773 - loss 0.04912347 - time (sec): 73.70 - samples/sec: 1681.26 - lr: 0.000039 - momentum: 0.000000
2023-10-17 11:04:08,068 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:08,069 EPOCH 3 done: loss 0.0490 - lr: 0.000039
2023-10-17 11:04:11,130 DEV : loss 0.07729422301054001 - f1-score (micro avg) 0.7535
2023-10-17 11:04:11,162 ----------------------------------------------------------------------------------------------------
2023-10-17 11:04:18,431 epoch 4 - iter 77/773 - loss 0.04152581 - time (sec): 7.27 - samples/sec: 1784.91 - lr: 0.000038 - momentum: 0.000000
2023-10-17 11:04:25,836 epoch 4 - iter 154/773 - loss 0.04153601 - time (sec): 14.67 - samples/sec: 1725.78 - lr: 0.000038 - momentum: 0.000000
2023-10-17 11:04:33,460 epoch 4 - iter 231/773 - loss 0.03745722 - time (sec): 22.30 - samples/sec: 1670.11 - lr: 0.000037 - momentum: 0.000000
2023-10-17 11:04:40,594 epoch 4 - iter 308/773 - loss 0.03833193 - time (sec): 29.43 - samples/sec: 1674.11 - lr: 0.000037 - momentum: 0.000000
2023-10-17 11:04:48,298 epoch 4 - iter 385/773 - loss 0.03751315 - time (sec): 37.13 - samples/sec: 1673.41 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:04:55,464 epoch 4 - iter 462/773 - loss 0.04103089 - time (sec): 44.30 - samples/sec: 1679.69 - lr: 0.000036 - momentum: 0.000000
2023-10-17 11:05:02,532 epoch 4 - iter 539/773 - loss 0.04171505 - time (sec): 51.37 - samples/sec: 1686.35 - lr: 0.000035 - momentum: 0.000000
2023-10-17 11:05:09,597 epoch 4 - iter 616/773 - loss 0.04053471 - time (sec): 58.43 - samples/sec: 1704.07 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:05:16,908 epoch 4 - iter 693/773 - loss 0.04085799 - time (sec): 65.74 - samples/sec: 1703.30 - lr: 0.000034 - momentum: 0.000000
2023-10-17 11:05:23,845 epoch 4 - iter 770/773 - loss 0.03858981 - time (sec): 72.68 - samples/sec: 1703.53 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:05:24,115 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:24,115 EPOCH 4 done: loss 0.0386 - lr: 0.000033
2023-10-17 11:05:27,335 DEV : loss 0.10291995108127594 - f1-score (micro avg) 0.75
2023-10-17 11:05:27,368 ----------------------------------------------------------------------------------------------------
2023-10-17 11:05:34,468 epoch 5 - iter 77/773 - loss 0.02121432 - time (sec): 7.10 - samples/sec: 1722.34 - lr: 0.000033 - momentum: 0.000000
2023-10-17 11:05:41,809 epoch 5 - iter 154/773 - loss 0.02173581 - time (sec): 14.44 - samples/sec: 1656.77 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:05:49,386 epoch 5 - iter 231/773 - loss 0.02284222 - time (sec): 22.02 - samples/sec: 1623.47 - lr: 0.000032 - momentum: 0.000000
2023-10-17 11:05:57,449 epoch 5 - iter 308/773 - loss 0.02224722 - time (sec): 30.08 - samples/sec: 1599.93 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:06:05,538 epoch 5 - iter 385/773 - loss 0.02315292 - time (sec): 38.17 - samples/sec: 1584.64 - lr: 0.000031 - momentum: 0.000000
2023-10-17 11:06:13,313 epoch 5 - iter 462/773 - loss 0.02280264 - time (sec): 45.94 - samples/sec: 1586.27 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:06:21,346 epoch 5 - iter 539/773 - loss 0.02222341 - time (sec): 53.98 - samples/sec: 1605.57 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:06:29,147 epoch 5 - iter 616/773 - loss 0.02225636 - time (sec): 61.78 - samples/sec: 1604.81 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:06:36,844 epoch 5 - iter 693/773 - loss 0.02247422 - time (sec): 69.47 - samples/sec: 1598.18 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:06:44,690 epoch 5 - iter 770/773 - loss 0.02385905 - time (sec): 77.32 - samples/sec: 1601.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:06:44,982 ----------------------------------------------------------------------------------------------------
2023-10-17 11:06:44,983 EPOCH 5 done: loss 0.0239 - lr: 0.000028
2023-10-17 11:06:47,973 DEV : loss 0.10200614482164383 - f1-score (micro avg) 0.7755
2023-10-17 11:06:48,006 saving best model
2023-10-17 11:06:49,423 ----------------------------------------------------------------------------------------------------
2023-10-17 11:06:57,017 epoch 6 - iter 77/773 - loss 0.02265088 - time (sec): 7.59 - samples/sec: 1636.48 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:07:04,370 epoch 6 - iter 154/773 - loss 0.01754504 - time (sec): 14.94 - samples/sec: 1670.67 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:07:11,949 epoch 6 - iter 231/773 - loss 0.02091300 - time (sec): 22.52 - samples/sec: 1669.32 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:07:19,655 epoch 6 - iter 308/773 - loss 0.02004128 - time (sec): 30.23 - samples/sec: 1659.22 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:07:27,169 epoch 6 - iter 385/773 - loss 0.01990145 - time (sec): 37.74 - samples/sec: 1647.46 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:07:34,917 epoch 6 - iter 462/773 - loss 0.01815685 - time (sec): 45.49 - samples/sec: 1638.47 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:07:42,565 epoch 6 - iter 539/773 - loss 0.01761772 - time (sec): 53.14 - samples/sec: 1653.33 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:07:50,486 epoch 6 - iter 616/773 - loss 0.01780920 - time (sec): 61.06 - samples/sec: 1629.64 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:07:58,505 epoch 6 - iter 693/773 - loss 0.01725283 - time (sec): 69.08 - samples/sec: 1615.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:08:06,220 epoch 6 - iter 770/773 - loss 0.01731812 - time (sec): 76.79 - samples/sec: 1612.45 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:08:06,512 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:06,513 EPOCH 6 done: loss 0.0175 - lr: 0.000022
2023-10-17 11:08:09,372 DEV : loss 0.1064695492386818 - f1-score (micro avg) 0.7812
2023-10-17 11:08:09,402 saving best model
2023-10-17 11:08:10,827 ----------------------------------------------------------------------------------------------------
2023-10-17 11:08:19,013 epoch 7 - iter 77/773 - loss 0.00823321 - time (sec): 8.18 - samples/sec: 1534.57 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:08:27,207 epoch 7 - iter 154/773 - loss 0.00771650 - time (sec): 16.38 - samples/sec: 1596.41 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:08:34,794 epoch 7 - iter 231/773 - loss 0.00743188 - time (sec): 23.96 - samples/sec: 1601.35 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:08:42,358 epoch 7 - iter 308/773 - loss 0.00932527 - time (sec): 31.53 - samples/sec: 1601.51 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:08:49,801 epoch 7 - iter 385/773 - loss 0.01044298 - time (sec): 38.97 - samples/sec: 1607.78 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:08:57,085 epoch 7 - iter 462/773 - loss 0.01033137 - time (sec): 46.25 - samples/sec: 1603.80 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:09:04,079 epoch 7 - iter 539/773 - loss 0.01077934 - time (sec): 53.25 - samples/sec: 1616.94 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:09:10,970 epoch 7 - iter 616/773 - loss 0.01030466 - time (sec): 60.14 - samples/sec: 1644.61 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:09:18,516 epoch 7 - iter 693/773 - loss 0.01069559 - time (sec): 67.68 - samples/sec: 1651.42 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:09:26,294 epoch 7 - iter 770/773 - loss 0.01086256 - time (sec): 75.46 - samples/sec: 1642.31 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:09:26,580 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:26,580 EPOCH 7 done: loss 0.0109 - lr: 0.000017
2023-10-17 11:09:29,687 DEV : loss 0.1156071275472641 - f1-score (micro avg) 0.795
2023-10-17 11:09:29,716 saving best model
2023-10-17 11:09:30,324 ----------------------------------------------------------------------------------------------------
2023-10-17 11:09:37,985 epoch 8 - iter 77/773 - loss 0.00730225 - time (sec): 7.66 - samples/sec: 1614.48 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:09:45,493 epoch 8 - iter 154/773 - loss 0.00623081 - time (sec): 15.17 - samples/sec: 1667.90 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:09:52,733 epoch 8 - iter 231/773 - loss 0.00556449 - time (sec): 22.41 - samples/sec: 1656.66 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:10:00,021 epoch 8 - iter 308/773 - loss 0.00601470 - time (sec): 29.69 - samples/sec: 1667.09 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:10:07,775 epoch 8 - iter 385/773 - loss 0.00628428 - time (sec): 37.45 - samples/sec: 1664.34 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:10:14,899 epoch 8 - iter 462/773 - loss 0.00605850 - time (sec): 44.57 - samples/sec: 1683.47 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:10:22,424 epoch 8 - iter 539/773 - loss 0.00700050 - time (sec): 52.10 - samples/sec: 1672.76 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:10:29,379 epoch 8 - iter 616/773 - loss 0.00759703 - time (sec): 59.05 - samples/sec: 1679.44 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:10:36,671 epoch 8 - iter 693/773 - loss 0.00788866 - time (sec): 66.35 - samples/sec: 1683.27 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:10:43,865 epoch 8 - iter 770/773 - loss 0.00753285 - time (sec): 73.54 - samples/sec: 1682.46 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:10:44,140 ----------------------------------------------------------------------------------------------------
2023-10-17 11:10:44,141 EPOCH 8 done: loss 0.0076 - lr: 0.000011
2023-10-17 11:10:47,005 DEV : loss 0.12029711902141571 - f1-score (micro avg) 0.7907
2023-10-17 11:10:47,036 ----------------------------------------------------------------------------------------------------
2023-10-17 11:10:55,094 epoch 9 - iter 77/773 - loss 0.00421926 - time (sec): 8.06 - samples/sec: 1520.59 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:11:02,690 epoch 9 - iter 154/773 - loss 0.00644406 - time (sec): 15.65 - samples/sec: 1544.60 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:11:10,046 epoch 9 - iter 231/773 - loss 0.00582051 - time (sec): 23.01 - samples/sec: 1641.13 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:11:17,664 epoch 9 - iter 308/773 - loss 0.00577635 - time (sec): 30.63 - samples/sec: 1616.34 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:11:24,755 epoch 9 - iter 385/773 - loss 0.00505832 - time (sec): 37.72 - samples/sec: 1630.49 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:11:31,696 epoch 9 - iter 462/773 - loss 0.00461600 - time (sec): 44.66 - samples/sec: 1643.56 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:11:38,576 epoch 9 - iter 539/773 - loss 0.00451370 - time (sec): 51.54 - samples/sec: 1659.27 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:11:45,934 epoch 9 - iter 616/773 - loss 0.00431455 - time (sec): 58.90 - samples/sec: 1685.41 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:11:53,167 epoch 9 - iter 693/773 - loss 0.00441306 - time (sec): 66.13 - samples/sec: 1680.31 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:12:00,970 epoch 9 - iter 770/773 - loss 0.00435539 - time (sec): 73.93 - samples/sec: 1677.04 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:12:01,228 ----------------------------------------------------------------------------------------------------
2023-10-17 11:12:01,228 EPOCH 9 done: loss 0.0044 - lr: 0.000006
2023-10-17 11:12:04,174 DEV : loss 0.12349887937307358 - f1-score (micro avg) 0.7844
2023-10-17 11:12:04,207 ----------------------------------------------------------------------------------------------------
2023-10-17 11:12:11,747 epoch 10 - iter 77/773 - loss 0.00455782 - time (sec): 7.54 - samples/sec: 1732.92 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:12:19,108 epoch 10 - iter 154/773 - loss 0.00420301 - time (sec): 14.90 - samples/sec: 1648.39 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:12:26,608 epoch 10 - iter 231/773 - loss 0.00367766 - time (sec): 22.40 - samples/sec: 1622.02 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:12:33,675 epoch 10 - iter 308/773 - loss 0.00347476 - time (sec): 29.47 - samples/sec: 1663.99 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:12:41,045 epoch 10 - iter 385/773 - loss 0.00326384 - time (sec): 36.84 - samples/sec: 1660.68 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:12:49,048 epoch 10 - iter 462/773 - loss 0.00312465 - time (sec): 44.84 - samples/sec: 1633.05 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:12:57,102 epoch 10 - iter 539/773 - loss 0.00324546 - time (sec): 52.89 - samples/sec: 1628.44 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:13:05,165 epoch 10 - iter 616/773 - loss 0.00279880 - time (sec): 60.96 - samples/sec: 1644.46 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:13:12,609 epoch 10 - iter 693/773 - loss 0.00267431 - time (sec): 68.40 - samples/sec: 1634.31 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:13:20,306 epoch 10 - iter 770/773 - loss 0.00286368 - time (sec): 76.10 - samples/sec: 1628.24 - lr: 0.000000 - momentum: 0.000000
2023-10-17 11:13:20,580 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:20,581 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-17 11:13:23,848 DEV : loss 0.12833575904369354 - f1-score (micro avg) 0.791
2023-10-17 11:13:24,507 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:24,509 Loading model from best epoch ...
2023-10-17 11:13:27,115 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 11:13:36,470
Results:
- F-score (micro) 0.8191
- F-score (macro) 0.7465
- Accuracy 0.7116

By class:
              precision    recall  f1-score   support

         LOC     0.8524    0.8668    0.8595       946
    BUILDING     0.6506    0.5838    0.6154       185
      STREET     0.8478    0.6964    0.7647        56

   micro avg     0.8237    0.8147    0.8191      1187
   macro avg     0.7836    0.7157    0.7465      1187
weighted avg     0.8207    0.8147    0.8170      1187

2023-10-17 11:13:36,470 ----------------------------------------------------------------------------------------------------