2023-10-14 00:49:48,661 ----------------------------------------------------------------------------------------------------
2023-10-14 00:49:48,662 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 00:49:48,662 ----------------------------------------------------------------------------------------------------
2023-10-14 00:49:48,663 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-14 00:49:48,663 ----------------------------------------------------------------------------------------------------
2023-10-14 00:49:48,663 Train:  7936 sentences
2023-10-14 00:49:48,663         (train_with_dev=False, train_with_test=False)
2023-10-14 00:49:48,663 ----------------------------------------------------------------------------------------------------
2023-10-14 00:49:48,663 Training Params:
2023-10-14 00:49:48,663  - learning_rate: "5e-05"
2023-10-14 00:49:48,663  - mini_batch_size: "4"
2023-10-14 00:49:48,663  - max_epochs: "10"
2023-10-14 00:49:48,663  - shuffle: "True"
2023-10-14 00:49:48,663 ----------------------------------------------------------------------------------------------------
2023-10-14 00:49:48,663 Plugins:
2023-10-14 00:49:48,663  - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 00:49:48,663 ----------------------------------------------------------------------------------------------------
2023-10-14 00:49:48,663 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 00:49:48,663  - metric: "('micro avg', 'f1-score')"
2023-10-14 00:49:48,663 ----------------------------------------------------------------------------------------------------
2023-10-14 00:49:48,663 Computation:
2023-10-14 00:49:48,663  - compute on device: cuda:0
2023-10-14 00:49:48,663  - embedding storage: none
2023-10-14 00:49:48,663 ----------------------------------------------------------------------------------------------------
2023-10-14 00:49:48,663 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-14 00:49:48,663 ----------------------------------------------------------------------------------------------------
2023-10-14 00:49:48,663 ----------------------------------------------------------------------------------------------------
2023-10-14 00:49:57,804 epoch 1 - iter 198/1984 - loss 1.53724749 - time (sec): 9.14 - samples/sec: 1689.62 - lr: 0.000005 - momentum: 0.000000
2023-10-14 00:50:06,820 epoch 1 - iter 396/1984 - loss 0.90847063 - time (sec): 18.16 - samples/sec: 1736.23 - lr: 0.000010 - momentum: 0.000000
2023-10-14 00:50:15,961 epoch 1 - iter 594/1984 - loss 0.66910730 - time (sec): 27.30 - samples/sec: 1762.35 - lr: 0.000015 - momentum: 0.000000
2023-10-14 00:50:25,101 epoch 1 - iter 792/1984 - loss 0.54194526 - time (sec): 36.44 - samples/sec: 1766.51 - lr: 0.000020 - momentum: 0.000000
2023-10-14 00:50:34,298 epoch 1 - iter 990/1984 - loss 0.46568736 - time (sec): 45.63 - samples/sec: 1771.15 - lr: 0.000025 - momentum: 0.000000
2023-10-14 00:50:43,508 epoch 1 - iter 1188/1984 - loss 0.40842958 - time (sec): 54.84 - samples/sec: 1784.44 - lr: 0.000030 - momentum: 0.000000
2023-10-14 00:50:52,633 epoch 1 - iter 1386/1984 - loss 0.37119435 - time (sec): 63.97 - samples/sec: 1778.05 - lr: 0.000035 - momentum: 0.000000
2023-10-14 00:51:01,659 epoch 1 - iter 1584/1984 - loss 0.34086154 - time (sec): 72.99 - samples/sec: 1781.84 - lr: 0.000040 - momentum: 0.000000
2023-10-14 00:51:10,631 epoch 1 - iter 1782/1984 - loss 0.31872014 - time (sec): 81.97 - samples/sec: 1784.62 - lr: 0.000045 - momentum: 0.000000
2023-10-14 00:51:20,282 epoch 1 - iter 1980/1984 - loss 0.30178455 - time (sec): 91.62 - samples/sec: 1783.56 - lr: 0.000050 - momentum: 0.000000
2023-10-14 00:51:20,520 ----------------------------------------------------------------------------------------------------
2023-10-14 00:51:20,520 EPOCH 1 done: loss 0.3012 - lr: 0.000050
2023-10-14 00:51:23,582 DEV : loss 0.1133425310254097 - f1-score (micro avg)  0.6583
2023-10-14 00:51:23,603 saving best model
2023-10-14 00:51:23,997 ----------------------------------------------------------------------------------------------------
2023-10-14 00:51:33,078 epoch 2 - iter 198/1984 - loss 0.13897226 - time (sec): 9.08 - samples/sec: 1674.90 - lr: 0.000049 - momentum: 0.000000
2023-10-14 00:51:42,040 epoch 2 - iter 396/1984 - loss 0.12542063 - time (sec): 18.04 - samples/sec: 1729.05 - lr: 0.000049 - momentum: 0.000000
2023-10-14 00:51:50,949 epoch 2 - iter 594/1984 - loss 0.12478130 - time (sec): 26.95 - samples/sec: 1748.86 - lr: 0.000048 - momentum: 0.000000
2023-10-14 00:52:00,060 epoch 2 - iter 792/1984 - loss 0.11920470 - time (sec): 36.06 - samples/sec: 1765.09 - lr: 0.000048 - momentum: 0.000000
2023-10-14 00:52:09,012 epoch 2 - iter 990/1984 - loss 0.11984526 - time (sec): 45.01 - samples/sec: 1795.84 - lr: 0.000047 - momentum: 0.000000
2023-10-14 00:52:18,122 epoch 2 - iter 1188/1984 - loss 0.11961263 - time (sec): 54.12 - samples/sec: 1805.77 - lr: 0.000047 - momentum: 0.000000
2023-10-14 00:52:27,096 epoch 2 - iter 1386/1984 - loss 0.11843545 - time (sec): 63.10 - samples/sec: 1810.41 - lr: 0.000046 - momentum: 0.000000
2023-10-14 00:52:36,079 epoch 2 - iter 1584/1984 - loss 0.11711417 - time (sec): 72.08 - samples/sec: 1806.73 - lr: 0.000046 - momentum: 0.000000
2023-10-14 00:52:45,332 epoch 2 - iter 1782/1984 - loss 0.11662742 - time (sec): 81.33 - samples/sec: 1807.98 - lr: 0.000045 - momentum: 0.000000
2023-10-14 00:52:54,577 epoch 2 - iter 1980/1984 - loss 0.11767432 - time (sec): 90.58 - samples/sec: 1805.11 - lr: 0.000044 - momentum: 0.000000
2023-10-14 00:52:54,777 ----------------------------------------------------------------------------------------------------
2023-10-14 00:52:54,778 EPOCH 2 done: loss 0.1176 - lr: 0.000044
2023-10-14 00:52:58,196 DEV : loss 0.11633753031492233 - f1-score (micro avg)  0.7391
2023-10-14 00:52:58,217 saving best model
2023-10-14 00:52:58,721 ----------------------------------------------------------------------------------------------------
2023-10-14 00:53:07,809 epoch 3 - iter 198/1984 - loss 0.07961185 - time (sec): 9.08 - samples/sec: 1662.08 - lr: 0.000044 - momentum: 0.000000
2023-10-14 00:53:16,610 epoch 3 - iter 396/1984 - loss 0.08656123 - time (sec): 17.89 - samples/sec: 1814.06 - lr: 0.000043 - momentum: 0.000000
2023-10-14 00:53:25,189 epoch 3 - iter 594/1984 - loss 0.08978808 - time (sec): 26.47 - samples/sec: 1809.36 - lr: 0.000043 - momentum: 0.000000
2023-10-14 00:53:33,894 epoch 3 - iter 792/1984 - loss 0.08926948 - time (sec): 35.17 - samples/sec: 1812.25 - lr: 0.000042 - momentum: 0.000000
2023-10-14 00:53:42,689 epoch 3 - iter 990/1984 - loss 0.08784584 - time (sec): 43.96 - samples/sec: 1852.59 - lr: 0.000042 - momentum: 0.000000
2023-10-14 00:53:51,395 epoch 3 - iter 1188/1984 - loss 0.08895092 - time (sec): 52.67 - samples/sec: 1857.35 - lr: 0.000041 - momentum: 0.000000
2023-10-14 00:54:00,102 epoch 3 - iter 1386/1984 - loss 0.08869961 - time (sec): 61.38 - samples/sec: 1867.22 - lr: 0.000041 - momentum: 0.000000
2023-10-14 00:54:09,347 epoch 3 - iter 1584/1984 - loss 0.09094737 - time (sec): 70.62 - samples/sec: 1860.38 - lr: 0.000040 - momentum: 0.000000
2023-10-14 00:54:17,952 epoch 3 - iter 1782/1984 - loss 0.09099563 - time (sec): 79.23 - samples/sec: 1855.10 - lr: 0.000039 - momentum: 0.000000
2023-10-14 00:54:26,871 epoch 3 - iter 1980/1984 - loss 0.09120135 - time (sec): 88.15 - samples/sec: 1856.11 - lr: 0.000039 - momentum: 0.000000
2023-10-14 00:54:27,053 ----------------------------------------------------------------------------------------------------
2023-10-14 00:54:27,054 EPOCH 3 done: loss 0.0910 - lr: 0.000039
2023-10-14 00:54:30,443 DEV : loss 0.1269545555114746 - f1-score (micro avg)  0.7242
2023-10-14 00:54:30,463 ----------------------------------------------------------------------------------------------------
2023-10-14 00:54:39,516 epoch 4 - iter 198/1984 - loss 0.05433866 - time (sec): 9.05 - samples/sec: 1935.46 - lr: 0.000038 - momentum: 0.000000
2023-10-14 00:54:48,580 epoch 4 - iter 396/1984 - loss 0.05884857 - time (sec): 18.12 - samples/sec: 1864.72 - lr: 0.000038 - momentum: 0.000000
2023-10-14 00:54:57,793 epoch 4 - iter 594/1984 - loss 0.06202748 - time (sec): 27.33 - samples/sec: 1833.00 - lr: 0.000037 - momentum: 0.000000
2023-10-14 00:55:06,811 epoch 4 - iter 792/1984 - loss 0.06408555 - time (sec): 36.35 - samples/sec: 1817.58 - lr: 0.000037 - momentum: 0.000000
2023-10-14 00:55:15,932 epoch 4 - iter 990/1984 - loss 0.06329006 - time (sec): 45.47 - samples/sec: 1819.63 - lr: 0.000036 - momentum: 0.000000
2023-10-14 00:55:24,967 epoch 4 - iter 1188/1984 - loss 0.06324019 - time (sec): 54.50 - samples/sec: 1811.97 - lr: 0.000036 - momentum: 0.000000
2023-10-14 00:55:33,967 epoch 4 - iter 1386/1984 - loss 0.06369240 - time (sec): 63.50 - samples/sec: 1800.01 - lr: 0.000035 - momentum: 0.000000
2023-10-14 00:55:42,856 epoch 4 - iter 1584/1984 - loss 0.06453190 - time (sec): 72.39 - samples/sec: 1802.53 - lr: 0.000034 - momentum: 0.000000
2023-10-14 00:55:51,827 epoch 4 - iter 1782/1984 - loss 0.06569458 - time (sec): 81.36 - samples/sec: 1796.61 - lr: 0.000034 - momentum: 0.000000
2023-10-14 00:56:01,014 epoch 4 - iter 1980/1984 - loss 0.06849618 - time (sec): 90.55 - samples/sec: 1807.37 - lr: 0.000033 - momentum: 0.000000
2023-10-14 00:56:01,216 ----------------------------------------------------------------------------------------------------
2023-10-14 00:56:01,216 EPOCH 4 done: loss 0.0684 - lr: 0.000033
2023-10-14 00:56:04,690 DEV : loss 0.14090722799301147 - f1-score (micro avg)  0.7191
2023-10-14 00:56:04,710 ----------------------------------------------------------------------------------------------------
2023-10-14 00:56:13,720 epoch 5 - iter 198/1984 - loss 0.04795978 - time (sec): 9.01 - samples/sec: 1837.13 - lr: 0.000033 - momentum: 0.000000
2023-10-14 00:56:22,926 epoch 5 - iter 396/1984 - loss 0.05324720 - time (sec): 18.21 - samples/sec: 1830.96 - lr: 0.000032 - momentum: 0.000000
2023-10-14 00:56:31,861 epoch 5 - iter 594/1984 - loss 0.05636139 - time (sec): 27.15 - samples/sec: 1815.80 - lr: 0.000032 - momentum: 0.000000
2023-10-14 00:56:40,804 epoch 5 - iter 792/1984 - loss 0.05252045 - time (sec): 36.09 - samples/sec: 1825.73 - lr: 0.000031 - momentum: 0.000000
2023-10-14 00:56:49,757 epoch 5 - iter 990/1984 - loss 0.05169474 - time (sec): 45.05 - samples/sec: 1827.51 - lr: 0.000031 - momentum: 0.000000
2023-10-14 00:56:58,664 epoch 5 - iter 1188/1984 - loss 0.05289705 - time (sec): 53.95 - samples/sec: 1832.82 - lr: 0.000030 - momentum: 0.000000
2023-10-14 00:57:07,546 epoch 5 - iter 1386/1984 - loss 0.05351570 - time (sec): 62.83 - samples/sec: 1817.54 - lr: 0.000029 - momentum: 0.000000
2023-10-14 00:57:16,606 epoch 5 - iter 1584/1984 - loss 0.05345907 - time (sec): 71.89 - samples/sec: 1825.17 - lr: 0.000029 - momentum: 0.000000
2023-10-14 00:57:25,725 epoch 5 - iter 1782/1984 - loss 0.05280644 - time (sec): 81.01 - samples/sec: 1817.67 - lr: 0.000028 - momentum: 0.000000
2023-10-14 00:57:34,688 epoch 5 - iter 1980/1984 - loss 0.05307208 - time (sec): 89.98 - samples/sec: 1819.40 - lr: 0.000028 - momentum: 0.000000
2023-10-14 00:57:34,864 ----------------------------------------------------------------------------------------------------
2023-10-14 00:57:34,864 EPOCH 5 done: loss 0.0530 - lr: 0.000028
2023-10-14 00:57:39,760 DEV : loss 0.16819310188293457 - f1-score (micro avg)  0.7497
2023-10-14 00:57:39,784 saving best model
2023-10-14 00:57:40,286 ----------------------------------------------------------------------------------------------------
2023-10-14 00:57:50,083 epoch 6 - iter 198/1984 - loss 0.03603590 - time (sec): 9.80 - samples/sec: 1769.03 - lr: 0.000027 - momentum: 0.000000
2023-10-14 00:57:59,416 epoch 6 - iter 396/1984 - loss 0.04328181 - time (sec): 19.13 - samples/sec: 1746.56 - lr: 0.000027 - momentum: 0.000000
2023-10-14 00:58:08,350 epoch 6 - iter 594/1984 - loss 0.04012567 - time (sec): 28.06 - samples/sec: 1753.94 - lr: 0.000026 - momentum: 0.000000
2023-10-14 00:58:17,321 epoch 6 - iter 792/1984 - loss 0.04096933 - time (sec): 37.03 - samples/sec: 1778.47 - lr: 0.000026 - momentum: 0.000000
2023-10-14 00:58:26,296 epoch 6 - iter 990/1984 - loss 0.03953279 - time (sec): 46.01 - samples/sec: 1792.37 - lr: 0.000025 - momentum: 0.000000
2023-10-14 00:58:35,124 epoch 6 - iter 1188/1984 - loss 0.03902963 - time (sec): 54.84 - samples/sec: 1794.64 - lr: 0.000024 - momentum: 0.000000
2023-10-14 00:58:44,185 epoch 6 - iter 1386/1984 - loss 0.04039913 - time (sec): 63.90 - samples/sec: 1795.19 - lr: 0.000024 - momentum: 0.000000
2023-10-14 00:58:53,214 epoch 6 - iter 1584/1984 - loss 0.03961834 - time (sec): 72.93 - samples/sec: 1797.77 - lr: 0.000023 - momentum: 0.000000
2023-10-14 00:59:02,350 epoch 6 - iter 1782/1984 - loss 0.03908488 - time (sec): 82.06 - samples/sec: 1804.50 - lr: 0.000023 - momentum: 0.000000
2023-10-14 00:59:11,391 epoch 6 - iter 1980/1984 - loss 0.03925037 - time (sec): 91.10 - samples/sec: 1796.88 - lr: 0.000022 - momentum: 0.000000
2023-10-14 00:59:11,570 ----------------------------------------------------------------------------------------------------
2023-10-14 00:59:11,570 EPOCH 6 done: loss 0.0392 - lr: 0.000022
2023-10-14 00:59:14,951 DEV : loss 0.17521554231643677 - f1-score (micro avg)  0.7618
2023-10-14 00:59:14,974 saving best model
2023-10-14 00:59:15,489 ----------------------------------------------------------------------------------------------------
2023-10-14 00:59:24,475 epoch 7 - iter 198/1984 - loss 0.02676968 - time (sec): 8.98 - samples/sec: 1799.42 - lr: 0.000022 - momentum: 0.000000
2023-10-14 00:59:33,835 epoch 7 - iter 396/1984 - loss 0.03142616 - time (sec): 18.34 - samples/sec: 1750.40 - lr: 0.000021 - momentum: 0.000000
2023-10-14 00:59:42,873 epoch 7 - iter 594/1984 - loss 0.02888195 - time (sec): 27.38 - samples/sec: 1792.40 - lr: 0.000021 - momentum: 0.000000
2023-10-14 00:59:51,844 epoch 7 - iter 792/1984 - loss 0.02988660 - time (sec): 36.35 - samples/sec: 1805.73 - lr: 0.000020 - momentum: 0.000000
2023-10-14 01:00:00,760 epoch 7 - iter 990/1984 - loss 0.03159975 - time (sec): 45.27 - samples/sec: 1803.07 - lr: 0.000019 - momentum: 0.000000
2023-10-14 01:00:09,886 epoch 7 - iter 1188/1984 - loss 0.03155178 - time (sec): 54.40 - samples/sec: 1807.29 - lr: 0.000019 - momentum: 0.000000
2023-10-14 01:00:19,081 epoch 7 - iter 1386/1984 - loss 0.03118184 - time (sec): 63.59 - samples/sec: 1808.97 - lr: 0.000018 - momentum: 0.000000
2023-10-14 01:00:28,165 epoch 7 - iter 1584/1984 - loss 0.03109553 - time (sec): 72.67 - samples/sec: 1804.28 - lr: 0.000018 - momentum: 0.000000
2023-10-14 01:00:37,267 epoch 7 - iter 1782/1984 - loss 0.03006272 - time (sec): 81.78 - samples/sec: 1803.23 - lr: 0.000017 - momentum: 0.000000
2023-10-14 01:00:46,318 epoch 7 - iter 1980/1984 - loss 0.02962727 - time (sec): 90.83 - samples/sec: 1801.81 - lr: 0.000017 - momentum: 0.000000
2023-10-14 01:00:46,499 ----------------------------------------------------------------------------------------------------
2023-10-14 01:00:46,499 EPOCH 7 done: loss 0.0298 - lr: 0.000017
2023-10-14 01:00:50,354 DEV : loss 0.18237848579883575 - f1-score (micro avg)  0.7609
2023-10-14 01:00:50,375 ----------------------------------------------------------------------------------------------------
2023-10-14 01:00:59,048 epoch 8 - iter 198/1984 - loss 0.01637163 - time (sec): 8.67 - samples/sec: 1981.13 - lr: 0.000016 - momentum: 0.000000
2023-10-14 01:01:07,691 epoch 8 - iter 396/1984 - loss 0.01553628 - time (sec): 17.32 - samples/sec: 1920.38 - lr: 0.000016 - momentum: 0.000000
2023-10-14 01:01:16,305 epoch 8 - iter 594/1984 - loss 0.01596650 - time (sec): 25.93 - samples/sec: 1883.00 - lr: 0.000015 - momentum: 0.000000
2023-10-14 01:01:25,141 epoch 8 - iter 792/1984 - loss 0.01803168 - time (sec): 34.76 - samples/sec: 1901.83 - lr: 0.000014 - momentum: 0.000000
2023-10-14 01:01:33,899 epoch 8 - iter 990/1984 - loss 0.01727232 - time (sec): 43.52 - samples/sec: 1907.35 - lr: 0.000014 - momentum: 0.000000
2023-10-14 01:01:42,671 epoch 8 - iter 1188/1984 - loss 0.02017091 - time (sec): 52.30 - samples/sec: 1910.58 - lr: 0.000013 - momentum: 0.000000
2023-10-14 01:01:51,523 epoch 8 - iter 1386/1984 - loss 0.01917780 - time (sec): 61.15 - samples/sec: 1899.31 - lr: 0.000013 - momentum: 0.000000
2023-10-14 01:02:00,393 epoch 8 - iter 1584/1984 - loss 0.01920417 - time (sec): 70.02 - samples/sec: 1883.84 - lr: 0.000012 - momentum: 0.000000
2023-10-14 01:02:09,373 epoch 8 - iter 1782/1984 - loss 0.01963716 - time (sec): 79.00 - samples/sec: 1872.22 - lr: 0.000012 - momentum: 0.000000
2023-10-14 01:02:18,308 epoch 8 - iter 1980/1984 - loss 0.01978786 - time (sec): 87.93 - samples/sec: 1862.43 - lr: 0.000011 - momentum: 0.000000
2023-10-14 01:02:18,486 ----------------------------------------------------------------------------------------------------
2023-10-14 01:02:18,486 EPOCH 8 done: loss 0.0198 - lr: 0.000011
2023-10-14 01:02:21,996 DEV : loss 0.1977926641702652 - f1-score (micro avg)  0.751
2023-10-14 01:02:22,017 ----------------------------------------------------------------------------------------------------
2023-10-14 01:02:31,041 epoch 9 - iter 198/1984 - loss 0.01112291 - time (sec): 9.02 - samples/sec: 1793.00 - lr: 0.000011 - momentum: 0.000000
2023-10-14 01:02:40,087 epoch 9 - iter 396/1984 - loss 0.01234062 - time (sec): 18.07 - samples/sec: 1838.45 - lr: 0.000010 - momentum: 0.000000
2023-10-14 01:02:49,315 epoch 9 - iter 594/1984 - loss 0.01313899 - time (sec): 27.30 - samples/sec: 1833.18 - lr: 0.000009 - momentum: 0.000000
2023-10-14 01:02:58,380 epoch 9 - iter 792/1984 - loss 0.01244334 - time (sec): 36.36 - samples/sec: 1805.87 - lr: 0.000009 - momentum: 0.000000
2023-10-14 01:03:07,338 epoch 9 - iter 990/1984 - loss 0.01299304 - time (sec): 45.32 - samples/sec: 1815.54 - lr: 0.000008 - momentum: 0.000000
2023-10-14 01:03:16,250 epoch 9 - iter 1188/1984 - loss 0.01368344 - time (sec): 54.23 - samples/sec: 1819.76 - lr: 0.000008 - momentum: 0.000000
2023-10-14 01:03:25,416 epoch 9 - iter 1386/1984 - loss 0.01365118 - time (sec): 63.40 - samples/sec: 1810.13 - lr: 0.000007 - momentum: 0.000000
2023-10-14 01:03:34,466 epoch 9 - iter 1584/1984 - loss 0.01403258 - time (sec): 72.45 - samples/sec: 1818.63 - lr: 0.000007 - momentum: 0.000000
2023-10-14 01:03:43,362 epoch 9 - iter 1782/1984 - loss 0.01384713 - time (sec): 81.34 - samples/sec: 1814.98 - lr: 0.000006 - momentum: 0.000000
2023-10-14 01:03:52,293 epoch 9 - iter 1980/1984 - loss 0.01379251 - time (sec): 90.27 - samples/sec: 1811.60 - lr: 0.000006 - momentum: 0.000000
2023-10-14 01:03:52,482 ----------------------------------------------------------------------------------------------------
2023-10-14 01:03:52,482 EPOCH 9 done: loss 0.0138 - lr: 0.000006
2023-10-14 01:03:55,900 DEV : loss 0.21640685200691223 - f1-score (micro avg)  0.7655
2023-10-14 01:03:55,921 saving best model
2023-10-14 01:03:56,446 ----------------------------------------------------------------------------------------------------
2023-10-14 01:04:05,495 epoch 10 - iter 198/1984 - loss 0.01091250 - time (sec): 9.04 - samples/sec: 1942.89 - lr: 0.000005 - momentum: 0.000000
2023-10-14 01:04:14,530 epoch 10 - iter 396/1984 - loss 0.01087071 - time (sec): 18.08 - samples/sec: 1885.48 - lr: 0.000004 - momentum: 0.000000
2023-10-14 01:04:23,401 epoch 10 - iter 594/1984 - loss 0.00990636 - time (sec): 26.95 - samples/sec: 1824.67 - lr: 0.000004 - momentum: 0.000000
2023-10-14 01:04:32,360 epoch 10 - iter 792/1984 - loss 0.00993432 - time (sec): 35.91 - samples/sec: 1831.69 - lr: 0.000003 - momentum: 0.000000
2023-10-14 01:04:41,292 epoch 10 - iter 990/1984 - loss 0.00944508 - time (sec): 44.84 - samples/sec: 1836.42 - lr: 0.000003 - momentum: 0.000000
2023-10-14 01:04:50,258 epoch 10 - iter 1188/1984 - loss 0.00925638 - time (sec): 53.81 - samples/sec: 1828.55 - lr: 0.000002 - momentum: 0.000000
2023-10-14 01:04:59,457 epoch 10 - iter 1386/1984 - loss 0.00937031 - time (sec): 63.01 - samples/sec: 1819.31 - lr: 0.000002 - momentum: 0.000000
2023-10-14 01:05:08,481 epoch 10 - iter 1584/1984 - loss 0.00966340 - time (sec): 72.03 - samples/sec: 1822.76 - lr: 0.000001 - momentum: 0.000000
2023-10-14 01:05:17,395 epoch 10 - iter 1782/1984 - loss 0.00945089 - time (sec): 80.94 - samples/sec: 1826.81 - lr: 0.000001 - momentum: 0.000000
2023-10-14 01:05:26,338 epoch 10 - iter 1980/1984 - loss 0.01001546 - time (sec): 89.89 - samples/sec: 1821.03 - lr: 0.000000 - momentum: 0.000000
2023-10-14 01:05:26,534 ----------------------------------------------------------------------------------------------------
2023-10-14 01:05:26,534 EPOCH 10 done: loss 0.0100 - lr: 0.000000
2023-10-14 01:05:30,455 DEV : loss 0.21858979761600494 - f1-score (micro avg)  0.7704
2023-10-14 01:05:30,477 saving best model
2023-10-14 01:05:31,356 ----------------------------------------------------------------------------------------------------
2023-10-14 01:05:31,357 Loading model from best epoch ...
2023-10-14 01:05:32,722 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 01:05:36,001 Results:
- F-score (micro) 0.7851
- F-score (macro) 0.7065
- Accuracy 0.6694

By class:
              precision    recall  f1-score   support

         LOC     0.8153    0.8626    0.8383       655
         PER     0.7236    0.7982    0.7591       223
         ORG     0.5960    0.4646    0.5221       127

   micro avg     0.7726    0.7980    0.7851      1005
   macro avg     0.7116    0.7085    0.7065      1005
weighted avg     0.7672    0.7980    0.7807      1005

2023-10-14 01:05:36,001 ----------------------------------------------------------------------------------------------------
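The logged learning rates (ramping from 0.000005 up to 0.000050 over epoch 1, then decaying to 0.000000 by the end of epoch 10) are consistent with a linear warmup/decay schedule with `warmup_fraction: 0.1` over 10 × 1984 = 19840 steps. A minimal sketch of that schedule, assuming plain linear warmup then linear decay (the helper name `linear_lr` is ours, not Flair's API):

```python
# Sketch of the LinearScheduler behavior implied by this log:
# peak lr 5e-05, warmup_fraction 0.1, 10 epochs x 1984 iterations.
PEAK_LR = 5e-05
TOTAL_STEPS = 10 * 1984          # 19840 optimizer steps overall
WARMUP_STEPS = TOTAL_STEPS // 10  # warmup_fraction 0.1 -> 1984 steps

def linear_lr(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay toward 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Epoch 1, iter 198 -> step 198: ~0.000005, matching the first logged lr.
print(round(linear_lr(198), 6))
# Epoch 2, iter 1980 -> step 1984 + 1980 = 3964: ~0.000044, as logged.
print(round(linear_lr(3964), 6))
```

This reproduces the logged lr at each checkpoint to the six decimals the log prints, which is why lr peaks exactly at the end of epoch 1.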
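The micro-avg row of the final table pools true positives over all classes before computing precision/recall, unlike the macro avg, which averages per-class scores. The counts below are reconstructed from the logged per-class precision/recall/support (e.g. LOC: recall 0.8626 × 655 ≈ 565 true positives), so treat them as an illustration rather than ground truth:

```python
# Reproducing the "micro avg" row of the classification report from
# per-class counts. entity: (true_positives, num_predicted, support);
# counts reconstructed from the rounded scores in the log above.
counts = {
    "LOC": (565, 693, 655),
    "PER": (178, 246, 223),
    "ORG": (59, 99, 127),
}

tp = sum(c[0] for c in counts.values())       # pooled true positives
pred = sum(c[1] for c in counts.values())     # pooled predictions
support = sum(c[2] for c in counts.values())  # pooled gold spans (1005)

micro_p = tp / pred
micro_r = tp / support
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(micro_p, 4), round(micro_r, 4), round(micro_f1, 4))
# -> 0.7726 0.798 0.7851, matching the micro-avg row reported above.
```

Because micro averaging weights every span equally, the large LOC class (655 of 1005 spans) dominates the 0.7851 headline score, while the weaker ORG class drags the macro F1 down to 0.7065.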