2023-10-13 23:55:15,562 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,563 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 23:55:15,563 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Train:  7936 sentences
2023-10-13 23:55:15,564         (train_with_dev=False, train_with_test=False)
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Training Params:
2023-10-13 23:55:15,564  - learning_rate: "5e-05"
2023-10-13 23:55:15,564  - mini_batch_size: "4"
2023-10-13 23:55:15,564  - max_epochs: "10"
2023-10-13 23:55:15,564  - shuffle: "True"
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Plugins:
2023-10-13 23:55:15,564  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 23:55:15,564  - metric: "('micro avg', 'f1-score')"
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Computation:
2023-10-13 23:55:15,564  - compute on device: cuda:0
2023-10-13 23:55:15,564  - embedding storage: none
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:24,306 epoch 1 - iter 198/1984 - loss 1.50772186 - time (sec): 8.74 - samples/sec: 1863.43 - lr: 0.000005 - momentum: 0.000000
2023-10-13 23:55:32,982 epoch 1 - iter 396/1984 - loss 0.90184360 - time (sec): 17.42 - samples/sec: 1859.29 - lr: 0.000010 - momentum: 0.000000
2023-10-13 23:55:41,617 epoch 1 - iter 594/1984 - loss 0.67583760 - time (sec): 26.05 - samples/sec: 1849.23 - lr: 0.000015 - momentum: 0.000000
2023-10-13 23:55:50,732 epoch 1 - iter 792/1984 - loss 0.55353196 - time (sec): 35.17 - samples/sec: 1835.49 - lr: 0.000020 - momentum: 0.000000
2023-10-13 23:55:59,824 epoch 1 - iter 990/1984 - loss 0.47277969 - time (sec): 44.26 - samples/sec: 1838.05 - lr: 0.000025 - momentum: 0.000000
2023-10-13 23:56:08,934 epoch 1 - iter 1188/1984 - loss 0.41000215 - time (sec): 53.37 - samples/sec: 1862.91 - lr: 0.000030 - momentum: 0.000000
2023-10-13 23:56:17,850 epoch 1 - iter 1386/1984 - loss 0.37433987 - time (sec): 62.28 - samples/sec: 1853.85 - lr: 0.000035 - momentum: 0.000000
2023-10-13 23:56:27,010 epoch 1 - iter 1584/1984 - loss 0.34567962 - time (sec): 71.45 - samples/sec: 1845.25 - lr: 0.000040 - momentum: 0.000000
2023-10-13 23:56:36,022 epoch 1 - iter 1782/1984 - loss 0.32457592 - time (sec): 80.46 - samples/sec: 1832.92 - lr: 0.000045 - momentum: 0.000000
2023-10-13 23:56:44,992 epoch 1 - iter 1980/1984 - loss 0.30739042 - time (sec): 89.43 - samples/sec: 1829.76 - lr: 0.000050 - momentum: 0.000000
2023-10-13 23:56:45,174 ----------------------------------------------------------------------------------------------------
2023-10-13 23:56:45,174 EPOCH 1 done: loss 0.3074 - lr: 0.000050
2023-10-13 23:56:48,815 DEV : loss 0.1368168443441391 - f1-score (micro avg) 0.6857
2023-10-13 23:56:48,836 saving best model
2023-10-13 23:56:49,275 ----------------------------------------------------------------------------------------------------
2023-10-13 23:56:58,249 epoch 2 - iter 198/1984 - loss 0.12279295 - time (sec): 8.97 - samples/sec: 1784.86 - lr: 0.000049 - momentum: 0.000000
2023-10-13 23:57:07,264 epoch 2 - iter 396/1984 - loss 0.11883519 - time (sec): 17.99 - samples/sec: 1808.64 - lr: 0.000049 - momentum: 0.000000
2023-10-13 23:57:16,299 epoch 2 - iter 594/1984 - loss 0.12767893 - time (sec): 27.02 - samples/sec: 1811.49 - lr: 0.000048 - momentum: 0.000000
2023-10-13 23:57:25,354 epoch 2 - iter 792/1984 - loss 0.12653949 - time (sec): 36.08 - samples/sec: 1815.08 - lr: 0.000048 - momentum: 0.000000
2023-10-13 23:57:34,318 epoch 2 - iter 990/1984 - loss 0.12528597 - time (sec): 45.04 - samples/sec: 1818.90 - lr: 0.000047 - momentum: 0.000000
2023-10-13 23:57:43,288 epoch 2 - iter 1188/1984 - loss 0.12318635 - time (sec): 54.01 - samples/sec: 1823.91 - lr: 0.000047 - momentum: 0.000000
2023-10-13 23:57:52,593 epoch 2 - iter 1386/1984 - loss 0.12351316 - time (sec): 63.32 - samples/sec: 1813.04 - lr: 0.000046 - momentum: 0.000000
2023-10-13 23:58:01,570 epoch 2 - iter 1584/1984 - loss 0.12274190 - time (sec): 72.29 - samples/sec: 1811.65 - lr: 0.000046 - momentum: 0.000000
2023-10-13 23:58:10,839 epoch 2 - iter 1782/1984 - loss 0.11980283 - time (sec): 81.56 - samples/sec: 1809.58 - lr: 0.000045 - momentum: 0.000000
2023-10-13 23:58:19,789 epoch 2 - iter 1980/1984 - loss 0.11769240 - time (sec): 90.51 - samples/sec: 1808.94 - lr: 0.000044 - momentum: 0.000000
2023-10-13 23:58:19,967 ----------------------------------------------------------------------------------------------------
2023-10-13 23:58:19,967 EPOCH 2 done: loss 0.1178 - lr: 0.000044
2023-10-13 23:58:23,421 DEV : loss 0.1154676303267479 - f1-score (micro avg) 0.7134
2023-10-13 23:58:23,442 saving best model
2023-10-13 23:58:23,980 ----------------------------------------------------------------------------------------------------
2023-10-13 23:58:33,121 epoch 3 - iter 198/1984 - loss 0.08197092 - time (sec): 9.14 - samples/sec: 1751.59 - lr: 0.000044 - momentum: 0.000000
2023-10-13 23:58:41,994 epoch 3 - iter 396/1984 - loss 0.08662040 - time (sec): 18.01 - samples/sec: 1770.73 - lr: 0.000043 - momentum: 0.000000
2023-10-13 23:58:51,072 epoch 3 - iter 594/1984 - loss 0.08983954 - time (sec): 27.09 - samples/sec: 1801.95 - lr: 0.000043 - momentum: 0.000000
2023-10-13 23:59:00,163 epoch 3 - iter 792/1984 - loss 0.08922528 - time (sec): 36.18 - samples/sec: 1823.58 - lr: 0.000042 - momentum: 0.000000
2023-10-13 23:59:09,201 epoch 3 - iter 990/1984 - loss 0.09021301 - time (sec): 45.22 - samples/sec: 1817.72 - lr: 0.000042 - momentum: 0.000000
2023-10-13 23:59:18,138 epoch 3 - iter 1188/1984 - loss 0.09283341 - time (sec): 54.15 - samples/sec: 1810.17 - lr: 0.000041 - momentum: 0.000000
2023-10-13 23:59:27,086 epoch 3 - iter 1386/1984 - loss 0.09059581 - time (sec): 63.10 - samples/sec: 1810.95 - lr: 0.000041 - momentum: 0.000000
2023-10-13 23:59:36,138 epoch 3 - iter 1584/1984 - loss 0.08879431 - time (sec): 72.15 - samples/sec: 1817.41 - lr: 0.000040 - momentum: 0.000000
2023-10-13 23:59:44,813 epoch 3 - iter 1782/1984 - loss 0.08740617 - time (sec): 80.83 - samples/sec: 1824.42 - lr: 0.000039 - momentum: 0.000000
2023-10-13 23:59:53,461 epoch 3 - iter 1980/1984 - loss 0.08760937 - time (sec): 89.48 - samples/sec: 1830.63 - lr: 0.000039 - momentum: 0.000000
2023-10-13 23:59:53,633 ----------------------------------------------------------------------------------------------------
2023-10-13 23:59:53,633 EPOCH 3 done: loss 0.0877 - lr: 0.000039
2023-10-13 23:59:57,488 DEV : loss 0.11892345547676086 - f1-score (micro avg) 0.7438
2023-10-13 23:59:57,509 saving best model
2023-10-13 23:59:58,051 ----------------------------------------------------------------------------------------------------
2023-10-14 00:00:07,100 epoch 4 - iter 198/1984 - loss 0.06258502 - time (sec): 9.05 - samples/sec: 1759.68 - lr: 0.000038 - momentum: 0.000000
2023-10-14 00:00:16,056 epoch 4 - iter 396/1984 - loss 0.06650697 - time (sec): 18.00 - samples/sec: 1818.05 - lr: 0.000038 - momentum: 0.000000
2023-10-14 00:00:24,941 epoch 4 - iter 594/1984 - loss 0.06754876 - time (sec): 26.89 - samples/sec: 1775.71 - lr: 0.000037 - momentum: 0.000000
2023-10-14 00:00:33,992 epoch 4 - iter 792/1984 - loss 0.06932127 - time (sec): 35.94 - samples/sec: 1791.07 - lr: 0.000037 - momentum: 0.000000
2023-10-14 00:00:42,974 epoch 4 - iter 990/1984 - loss 0.06932810 - time (sec): 44.92 - samples/sec: 1802.23 - lr: 0.000036 - momentum: 0.000000
2023-10-14 00:00:52,014 epoch 4 - iter 1188/1984 - loss 0.07083132 - time (sec): 53.96 - samples/sec: 1813.38 - lr: 0.000036 - momentum: 0.000000
2023-10-14 00:01:01,168 epoch 4 - iter 1386/1984 - loss 0.07101762 - time (sec): 63.11 - samples/sec: 1805.21 - lr: 0.000035 - momentum: 0.000000
2023-10-14 00:01:10,193 epoch 4 - iter 1584/1984 - loss 0.07066029 - time (sec): 72.14 - samples/sec: 1799.01 - lr: 0.000034 - momentum: 0.000000
2023-10-14 00:01:19,160 epoch 4 - iter 1782/1984 - loss 0.07056362 - time (sec): 81.11 - samples/sec: 1804.70 - lr: 0.000034 - momentum: 0.000000
2023-10-14 00:01:28,184 epoch 4 - iter 1980/1984 - loss 0.06978300 - time (sec): 90.13 - samples/sec: 1816.23 - lr: 0.000033 - momentum: 0.000000
2023-10-14 00:01:28,365 ----------------------------------------------------------------------------------------------------
2023-10-14 00:01:28,365 EPOCH 4 done: loss 0.0697 - lr: 0.000033
2023-10-14 00:01:31,842 DEV : loss 0.15688975155353546 - f1-score (micro avg) 0.7467
2023-10-14 00:01:31,863 saving best model
2023-10-14 00:01:32,417 ----------------------------------------------------------------------------------------------------
2023-10-14 00:01:41,585 epoch 5 - iter 198/1984 - loss 0.05068501 - time (sec): 9.16 - samples/sec: 1761.62 - lr: 0.000033 - momentum: 0.000000
2023-10-14 00:01:50,548 epoch 5 - iter 396/1984 - loss 0.05451821 - time (sec): 18.13 - samples/sec: 1811.75 - lr: 0.000032 - momentum: 0.000000
2023-10-14 00:01:59,521 epoch 5 - iter 594/1984 - loss 0.05001998 - time (sec): 27.10 - samples/sec: 1843.55 - lr: 0.000032 - momentum: 0.000000
2023-10-14 00:02:08,376 epoch 5 - iter 792/1984 - loss 0.04929968 - time (sec): 35.96 - samples/sec: 1826.24 - lr: 0.000031 - momentum: 0.000000
2023-10-14 00:02:17,340 epoch 5 - iter 990/1984 - loss 0.04965478 - time (sec): 44.92 - samples/sec: 1813.73 - lr: 0.000031 - momentum: 0.000000
2023-10-14 00:02:26,433 epoch 5 - iter 1188/1984 - loss 0.04955068 - time (sec): 54.01 - samples/sec: 1816.53 - lr: 0.000030 - momentum: 0.000000
2023-10-14 00:02:35,499 epoch 5 - iter 1386/1984 - loss 0.04994146 - time (sec): 63.08 - samples/sec: 1821.68 - lr: 0.000029 - momentum: 0.000000
2023-10-14 00:02:44,636 epoch 5 - iter 1584/1984 - loss 0.05157217 - time (sec): 72.21 - samples/sec: 1829.33 - lr: 0.000029 - momentum: 0.000000
2023-10-14 00:02:53,623 epoch 5 - iter 1782/1984 - loss 0.04967315 - time (sec): 81.20 - samples/sec: 1822.14 - lr: 0.000028 - momentum: 0.000000
2023-10-14 00:03:02,598 epoch 5 - iter 1980/1984 - loss 0.04984569 - time (sec): 90.18 - samples/sec: 1813.98 - lr: 0.000028 - momentum: 0.000000
2023-10-14 00:03:02,783 ----------------------------------------------------------------------------------------------------
2023-10-14 00:03:02,783 EPOCH 5 done: loss 0.0501 - lr: 0.000028
2023-10-14 00:03:06,243 DEV : loss 0.18403209745883942 - f1-score (micro avg) 0.7202
2023-10-14 00:03:06,264 ----------------------------------------------------------------------------------------------------
2023-10-14 00:03:15,457 epoch 6 - iter 198/1984 - loss 0.04584627 - time (sec): 9.19 - samples/sec: 1898.94 - lr: 0.000027 - momentum: 0.000000
2023-10-14 00:03:24,449 epoch 6 - iter 396/1984 - loss 0.04186945 - time (sec): 18.18 - samples/sec: 1829.36 - lr: 0.000027 - momentum: 0.000000
2023-10-14 00:03:33,393 epoch 6 - iter 594/1984 - loss 0.03941596 - time (sec): 27.13 - samples/sec: 1800.46 - lr: 0.000026 - momentum: 0.000000
2023-10-14 00:03:42,411 epoch 6 - iter 792/1984 - loss 0.03879688 - time (sec): 36.15 - samples/sec: 1808.18 - lr: 0.000026 - momentum: 0.000000
2023-10-14 00:03:51,359 epoch 6 - iter 990/1984 - loss 0.03828610 - time (sec): 45.09 - samples/sec: 1799.34 - lr: 0.000025 - momentum: 0.000000
2023-10-14 00:04:00,278 epoch 6 - iter 1188/1984 - loss 0.03781164 - time (sec): 54.01 - samples/sec: 1796.84 - lr: 0.000024 - momentum: 0.000000
2023-10-14 00:04:09,404 epoch 6 - iter 1386/1984 - loss 0.03790898 - time (sec): 63.14 - samples/sec: 1807.80 - lr: 0.000024 - momentum: 0.000000
2023-10-14 00:04:18,592 epoch 6 - iter 1584/1984 - loss 0.03778105 - time (sec): 72.33 - samples/sec: 1803.97 - lr: 0.000023 - momentum: 0.000000
2023-10-14 00:04:27,635 epoch 6 - iter 1782/1984 - loss 0.03857277 - time (sec): 81.37 - samples/sec: 1807.22 - lr: 0.000023 - momentum: 0.000000
2023-10-14 00:04:37,083 epoch 6 - iter 1980/1984 - loss 0.03900797 - time (sec): 90.82 - samples/sec: 1802.90 - lr: 0.000022 - momentum: 0.000000
2023-10-14 00:04:37,255 ----------------------------------------------------------------------------------------------------
2023-10-14 00:04:37,255 EPOCH 6 done: loss 0.0390 - lr: 0.000022
2023-10-14 00:04:40,657 DEV : loss 0.18327702581882477 - f1-score (micro avg) 0.7536
2023-10-14 00:04:40,678 saving best model
2023-10-14 00:04:41,211 ----------------------------------------------------------------------------------------------------
2023-10-14 00:04:50,351 epoch 7 - iter 198/1984 - loss 0.02501832 - time (sec): 9.14 - samples/sec: 1831.27 - lr: 0.000022 - momentum: 0.000000
2023-10-14 00:04:59,277 epoch 7 - iter 396/1984 - loss 0.02322471 - time (sec): 18.06 - samples/sec: 1834.84 - lr: 0.000021 - momentum: 0.000000
2023-10-14 00:05:08,302 epoch 7 - iter 594/1984 - loss 0.02153630 - time (sec): 27.09 - samples/sec: 1838.56 - lr: 0.000021 - momentum: 0.000000
2023-10-14 00:05:17,224 epoch 7 - iter 792/1984 - loss 0.02269734 - time (sec): 36.01 - samples/sec: 1807.75 - lr: 0.000020 - momentum: 0.000000
2023-10-14 00:05:26,210 epoch 7 - iter 990/1984 - loss 0.02477186 - time (sec): 44.99 - samples/sec: 1823.24 - lr: 0.000019 - momentum: 0.000000
2023-10-14 00:05:35,217 epoch 7 - iter 1188/1984 - loss 0.02497735 - time (sec): 54.00 - samples/sec: 1816.37 - lr: 0.000019 - momentum: 0.000000
2023-10-14 00:05:44,152 epoch 7 - iter 1386/1984 - loss 0.02400522 - time (sec): 62.94 - samples/sec: 1812.54 - lr: 0.000018 - momentum: 0.000000
2023-10-14 00:05:53,131 epoch 7 - iter 1584/1984 - loss 0.02529192 - time (sec): 71.91 - samples/sec: 1811.38 - lr: 0.000018 - momentum: 0.000000
2023-10-14 00:06:02,211 epoch 7 - iter 1782/1984 - loss 0.02578243 - time (sec): 80.99 - samples/sec: 1813.40 - lr: 0.000017 - momentum: 0.000000
2023-10-14 00:06:11,197 epoch 7 - iter 1980/1984 - loss 0.02724683 - time (sec): 89.98 - samples/sec: 1819.80 - lr: 0.000017 - momentum: 0.000000
2023-10-14 00:06:11,373 ----------------------------------------------------------------------------------------------------
2023-10-14 00:06:11,373 EPOCH 7 done: loss 0.0272 - lr: 0.000017
2023-10-14 00:06:14,886 DEV : loss 0.19809222221374512 - f1-score (micro avg) 0.7694
2023-10-14 00:06:14,908 saving best model
2023-10-14 00:06:15,452 ----------------------------------------------------------------------------------------------------
2023-10-14 00:06:24,684 epoch 8 - iter 198/1984 - loss 0.02503918 - time (sec): 9.23 - samples/sec: 1846.98 - lr: 0.000016 - momentum: 0.000000
2023-10-14 00:06:33,673 epoch 8 - iter 396/1984 - loss 0.02450807 - time (sec): 18.22 - samples/sec: 1830.88 - lr: 0.000016 - momentum: 0.000000
2023-10-14 00:06:42,985 epoch 8 - iter 594/1984 - loss 0.02228166 - time (sec): 27.53 - samples/sec: 1833.48 - lr: 0.000015 - momentum: 0.000000
2023-10-14 00:06:51,897 epoch 8 - iter 792/1984 - loss 0.02090114 - time (sec): 36.44 - samples/sec: 1831.96 - lr: 0.000014 - momentum: 0.000000
2023-10-14 00:07:00,918 epoch 8 - iter 990/1984 - loss 0.02167482 - time (sec): 45.46 - samples/sec: 1805.64 - lr: 0.000014 - momentum: 0.000000
2023-10-14 00:07:10,067 epoch 8 - iter 1188/1984 - loss 0.02111459 - time (sec): 54.61 - samples/sec: 1802.67 - lr: 0.000013 - momentum: 0.000000
2023-10-14 00:07:19,289 epoch 8 - iter 1386/1984 - loss 0.02001664 - time (sec): 63.83 - samples/sec: 1799.08 - lr: 0.000013 - momentum: 0.000000
2023-10-14 00:07:28,458 epoch 8 - iter 1584/1984 - loss 0.01937469 - time (sec): 73.00 - samples/sec: 1800.62 - lr: 0.000012 - momentum: 0.000000
2023-10-14 00:07:37,598 epoch 8 - iter 1782/1984 - loss 0.01890756 - time (sec): 82.14 - samples/sec: 1803.09 - lr: 0.000012 - momentum: 0.000000
2023-10-14 00:07:46,695 epoch 8 - iter 1980/1984 - loss 0.01916413 - time (sec): 91.24 - samples/sec: 1793.56 - lr: 0.000011 - momentum: 0.000000
2023-10-14 00:07:46,880 ----------------------------------------------------------------------------------------------------
2023-10-14 00:07:46,880 EPOCH 8 done: loss 0.0191 - lr: 0.000011
2023-10-14 00:07:50,764 DEV : loss 0.20701323449611664 - f1-score (micro avg) 0.7508
2023-10-14 00:07:50,785 ----------------------------------------------------------------------------------------------------
2023-10-14 00:07:59,690 epoch 9 - iter 198/1984 - loss 0.01683448 - time (sec): 8.90 - samples/sec: 1786.85 - lr: 0.000011 - momentum: 0.000000
2023-10-14 00:08:08,744 epoch 9 - iter 396/1984 - loss 0.01424361 - time (sec): 17.96 - samples/sec: 1797.04 - lr: 0.000010 - momentum: 0.000000
2023-10-14 00:08:17,764 epoch 9 - iter 594/1984 - loss 0.01329647 - time (sec): 26.98 - samples/sec: 1831.07 - lr: 0.000009 - momentum: 0.000000
2023-10-14 00:08:26,715 epoch 9 - iter 792/1984 - loss 0.01361289 - time (sec): 35.93 - samples/sec: 1835.51 - lr: 0.000009 - momentum: 0.000000
2023-10-14 00:08:35,693 epoch 9 - iter 990/1984 - loss 0.01248769 - time (sec): 44.91 - samples/sec: 1828.13 - lr: 0.000008 - momentum: 0.000000
2023-10-14 00:08:44,665 epoch 9 - iter 1188/1984 - loss 0.01303478 - time (sec): 53.88 - samples/sec: 1828.78 - lr: 0.000008 - momentum: 0.000000
2023-10-14 00:08:53,901 epoch 9 - iter 1386/1984 - loss 0.01248985 - time (sec): 63.12 - samples/sec: 1814.35 - lr: 0.000007 - momentum: 0.000000
2023-10-14 00:09:02,963 epoch 9 - iter 1584/1984 - loss 0.01249750 - time (sec): 72.18 - samples/sec: 1813.72 - lr: 0.000007 - momentum: 0.000000
2023-10-14 00:09:12,129 epoch 9 - iter 1782/1984 - loss 0.01229768 - time (sec): 81.34 - samples/sec: 1806.58 - lr: 0.000006 - momentum: 0.000000
2023-10-14 00:09:21,106 epoch 9 - iter 1980/1984 - loss 0.01212657 - time (sec): 90.32 - samples/sec: 1811.73 - lr: 0.000006 - momentum: 0.000000
2023-10-14 00:09:21,297 ----------------------------------------------------------------------------------------------------
2023-10-14 00:09:21,297 EPOCH 9 done: loss 0.0122 - lr: 0.000006
2023-10-14 00:09:24,760 DEV : loss 0.24803374707698822 - f1-score (micro avg) 0.7598
2023-10-14 00:09:24,781 ----------------------------------------------------------------------------------------------------
2023-10-14 00:09:34,131 epoch 10 - iter 198/1984 - loss 0.01047072 - time (sec): 9.35 - samples/sec: 1835.03 - lr: 0.000005 - momentum: 0.000000
2023-10-14 00:09:43,100 epoch 10 - iter 396/1984 - loss 0.00852613 - time (sec): 18.32 - samples/sec: 1812.58 - lr: 0.000004 - momentum: 0.000000
2023-10-14 00:09:52,160 epoch 10 - iter 594/1984 - loss 0.00813734 - time (sec): 27.38 - samples/sec: 1818.87 - lr: 0.000004 - momentum: 0.000000
2023-10-14 00:10:01,309 epoch 10 - iter 792/1984 - loss 0.00748039 - time (sec): 36.53 - samples/sec: 1827.01 - lr: 0.000003 - momentum: 0.000000
2023-10-14 00:10:10,508 epoch 10 - iter 990/1984 - loss 0.00764780 - time (sec): 45.73 - samples/sec: 1837.44 - lr: 0.000003 - momentum: 0.000000
2023-10-14 00:10:19,392 epoch 10 - iter 1188/1984 - loss 0.00762812 - time (sec): 54.61 - samples/sec: 1832.58 - lr: 0.000002 - momentum: 0.000000
2023-10-14 00:10:28,228 epoch 10 - iter 1386/1984 - loss 0.00802417 - time (sec): 63.44 - samples/sec: 1820.05 - lr: 0.000002 - momentum: 0.000000
2023-10-14 00:10:37,140 epoch 10 - iter 1584/1984 - loss 0.00806189 - time (sec): 72.36 - samples/sec: 1809.06 - lr: 0.000001 - momentum: 0.000000
2023-10-14 00:10:46,228 epoch 10 - iter 1782/1984 - loss 0.00772889 - time (sec): 81.44 - samples/sec: 1803.08 - lr: 0.000001 - momentum: 0.000000
2023-10-14 00:10:55,215 epoch 10 - iter 1980/1984 - loss 0.00739086 - time (sec): 90.43 - samples/sec: 1810.01 - lr: 0.000000 - momentum: 0.000000
2023-10-14 00:10:55,397 ----------------------------------------------------------------------------------------------------
2023-10-14 00:10:55,397 EPOCH 10 done: loss 0.0074 - lr: 0.000000
2023-10-14 00:10:59,305 DEV : loss 0.24994361400604248 - f1-score (micro avg) 0.7672
2023-10-14 00:10:59,753 ----------------------------------------------------------------------------------------------------
2023-10-14 00:10:59,754 Loading model from best epoch ...
2023-10-14 00:11:01,093 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 00:11:04,428 Results:
- F-score (micro) 0.7713
- F-score (macro) 0.6774
- Accuracy 0.6506

By class:
              precision    recall  f1-score   support

         LOC     0.8082    0.8427    0.8251       655
         PER     0.6825    0.8386    0.7525       223
         ORG     0.6338    0.3543    0.4545       127

   micro avg     0.7626    0.7801    0.7713      1005
   macro avg     0.7082    0.6785    0.6774      1005
weighted avg     0.7583    0.7801    0.7622      1005

2023-10-14 00:11:04,428 ----------------------------------------------------------------------------------------------------
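The learning-rate column in the log follows the "LinearScheduler | warmup_fraction: '0.1'" plugin: the rate climbs linearly to the configured peak of 5e-05 over the first 10% of all steps (exactly epoch 1, since 1984 of 19840 total steps), then decays linearly to zero. A minimal sketch of that schedule, assuming the simple two-phase formula (Flair's actual implementation may differ in off-by-one details; the constants are taken from the log above):

```python
def linear_schedule_lr(step: int, total_steps: int,
                       peak_lr: float, warmup_fraction: float) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: 0 -> peak_lr over the first warmup_steps
        return peak_lr * step / warmup_steps
    # decay phase: peak_lr -> 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


TOTAL_STEPS = 10 * 1984   # max_epochs * iterations per epoch (from the log)
PEAK_LR = 5e-5            # learning_rate from the log

# Matches the logged trajectory: ~0.000005 at iter 198 of epoch 1,
# 0.000050 at the end of epoch 1, ~0.000044 at the end of epoch 2.
for step in (198, 1984, 3968):
    print(step, linear_schedule_lr(step, TOTAL_STEPS, PEAK_LR, 0.1))
```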
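The tag dictionary printed at prediction time (13 tags) corresponds to the BIOES labeling scheme: each of the three entity types (PER, LOC, ORG) expands into Single/Begin/End/Inside variants, plus the lone "O" (outside) tag, which is why the tagger's final Linear layer has out_features=13. A sketch of how that tagset is generated (the helper name is illustrative, not a Flair API):

```python
def bioes_tagset(entity_types):
    """Build a BIOES tag dictionary: 'O' plus S-/B-/E-/I- per entity type."""
    tags = ["O"]
    for etype in entity_types:
        for prefix in ("S", "B", "E", "I"):  # Single, Begin, End, Inside
            tags.append(f"{prefix}-{etype}")
    return tags


tags = bioes_tagset(["PER", "LOC", "ORG"])
print(len(tags))  # -> 13, matching out_features of the tagger's linear layer
```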
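The micro and macro averages in the final table can be cross-checked from the per-class rows alone: recall times support recovers each class's true-positive count, and dividing by precision recovers its prediction count, from which the micro average follows. A small verification sketch using only numbers copied from the results table above:

```python
# (precision, recall, support) per class, copied from the final results table.
per_class = {
    "LOC": (0.8082, 0.8427, 655),
    "PER": (0.6825, 0.8386, 223),
    "ORG": (0.6338, 0.3543, 127),
}

tp = pred = gold = 0
f1s = []
for p, r, support in per_class.values():
    c_tp = round(r * support)        # true positives = recall * support
    tp += c_tp
    pred += round(c_tp / p)          # predicted spans = TP / precision
    gold += support
    f1s.append(2 * p * r / (p + r))  # per-class F1

micro_f1 = 2 * tp / (pred + gold)    # equivalent to 2PR/(P+R) for micro avgs
macro_f1 = sum(f1s) / len(f1s)       # unweighted mean of per-class F1
print(round(micro_f1, 4), round(macro_f1, 4))  # -> 0.7713 0.6774
```

Both recomputed values agree with the logged "F-score (micro) 0.7713" and "F-score (macro) 0.6774"; the gap between them is driven almost entirely by the weak ORG recall (0.3543 over only 127 support spans).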