2023-10-13 22:44:08,972 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
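The layer shapes printed above fully determine the model's parameter count. As a quick sanity check (a sketch tallied from the printed dimensions only, not read from the actual checkpoint), the sizes work out to roughly 110.6M parameters for the BERT backbone plus a small 13-way classification head:

```python
# Tally parameters of the BertModel printed above from its layer shapes alone.
# Linear(in, out) has in*out weights + out biases; LayerNorm(768) has 768
# weights + 768 biases; Embedding(n, d) has n*d weights.

def linear(n_in, n_out):
    return n_in * n_out + n_out

def layer_norm(dim=768):
    return 2 * dim

# BertEmbeddings: word, position, and token-type embeddings + LayerNorm
embeddings = 32001 * 768 + 512 * 768 + 2 * 768 + layer_norm()

# One BertLayer: Q/K/V projections, attention output, intermediate,
# output projection, and two LayerNorms
per_layer = (3 * linear(768, 768)      # query, key, value
             + linear(768, 768)        # BertSelfOutput.dense
             + layer_norm()
             + linear(768, 3072)       # BertIntermediate.dense
             + linear(3072, 768)       # BertOutput.dense
             + layer_norm())

pooler = linear(768, 768)
bert_total = embeddings + 12 * per_layer + pooler

head = linear(768, 13)                 # the tagger's linear classification layer

print(bert_total)  # 110618112  (~110.6M parameters)
print(head)        # 9997
```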
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 Train: 7936 sentences
2023-10-13 22:44:08,973 (train_with_dev=False, train_with_test=False)
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 Training Params:
2023-10-13 22:44:08,973 - learning_rate: "3e-05"
2023-10-13 22:44:08,973 - mini_batch_size: "4"
2023-10-13 22:44:08,973 - max_epochs: "10"
2023-10-13 22:44:08,973 - shuffle: "True"
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 Plugins:
2023-10-13 22:44:08,973 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,974 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 22:44:08,974 - metric: "('micro avg', 'f1-score')"
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,974 Computation:
2023-10-13 22:44:08,974 - compute on device: cuda:0
2023-10-13 22:44:08,974 - embedding storage: none
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,974 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
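The Plugins section shows a LinearScheduler with warmup_fraction 0.1. With 1984 mini-batches per epoch over 10 epochs (19,840 optimizer steps), the learning rate ramps linearly from 0 to 3e-05 during the first 1,984 steps and then decays linearly back to 0, which matches the lr column in the per-epoch lines below. A minimal sketch of that schedule (the function name is illustrative, not Flair's API):

```python
def linear_warmup_lr(step, total_steps=19840, peak_lr=3e-05, warmup_fraction=0.1):
    """Learning rate at a given optimizer step under linear warmup + linear decay."""
    warmup_steps = int(total_steps * warmup_fraction)  # 1984 steps for this run
    if step < warmup_steps:
        return peak_lr * step / warmup_steps           # ramp 0 -> peak
    # linear decay from peak back to 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_warmup_lr(198))    # ~3e-06, matching "lr: 0.000003" at iter 198
print(linear_warmup_lr(1984))   # 3e-05, the peak reached at the end of epoch 1
print(linear_warmup_lr(19840))  # 0.0 at the final step
```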
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:18,081 epoch 1 - iter 198/1984 - loss 1.81064836 - time (sec): 9.11 - samples/sec: 1829.54 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:44:27,066 epoch 1 - iter 396/1984 - loss 1.08951470 - time (sec): 18.09 - samples/sec: 1811.50 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:44:36,144 epoch 1 - iter 594/1984 - loss 0.79725090 - time (sec): 27.17 - samples/sec: 1822.29 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:44:45,070 epoch 1 - iter 792/1984 - loss 0.65426686 - time (sec): 36.10 - samples/sec: 1820.77 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:44:54,112 epoch 1 - iter 990/1984 - loss 0.55873840 - time (sec): 45.14 - samples/sec: 1826.09 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:45:02,966 epoch 1 - iter 1188/1984 - loss 0.49032838 - time (sec): 53.99 - samples/sec: 1833.75 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:45:11,931 epoch 1 - iter 1386/1984 - loss 0.44094947 - time (sec): 62.96 - samples/sec: 1830.34 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:45:20,963 epoch 1 - iter 1584/1984 - loss 0.40474175 - time (sec): 71.99 - samples/sec: 1830.66 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:45:30,256 epoch 1 - iter 1782/1984 - loss 0.37525747 - time (sec): 81.28 - samples/sec: 1818.44 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:45:39,648 epoch 1 - iter 1980/1984 - loss 0.35281987 - time (sec): 90.67 - samples/sec: 1805.98 - lr: 0.000030 - momentum: 0.000000
2023-10-13 22:45:39,839 ----------------------------------------------------------------------------------------------------
2023-10-13 22:45:39,839 EPOCH 1 done: loss 0.3526 - lr: 0.000030
2023-10-13 22:45:42,914 DEV : loss 0.10146976262331009 - f1-score (micro avg) 0.7264
2023-10-13 22:45:42,937 saving best model
2023-10-13 22:45:43,336 ----------------------------------------------------------------------------------------------------
2023-10-13 22:45:52,299 epoch 2 - iter 198/1984 - loss 0.13085115 - time (sec): 8.96 - samples/sec: 1965.02 - lr: 0.000030 - momentum: 0.000000
2023-10-13 22:46:01,550 epoch 2 - iter 396/1984 - loss 0.12180186 - time (sec): 18.21 - samples/sec: 1845.38 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:46:10,520 epoch 2 - iter 594/1984 - loss 0.11729603 - time (sec): 27.18 - samples/sec: 1843.65 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:46:19,430 epoch 2 - iter 792/1984 - loss 0.11624537 - time (sec): 36.09 - samples/sec: 1816.81 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:46:28,458 epoch 2 - iter 990/1984 - loss 0.11635377 - time (sec): 45.12 - samples/sec: 1810.56 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:46:37,650 epoch 2 - iter 1188/1984 - loss 0.11405465 - time (sec): 54.31 - samples/sec: 1800.34 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:46:46,738 epoch 2 - iter 1386/1984 - loss 0.11342189 - time (sec): 63.40 - samples/sec: 1795.44 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:46:55,776 epoch 2 - iter 1584/1984 - loss 0.11424700 - time (sec): 72.44 - samples/sec: 1803.73 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:47:04,814 epoch 2 - iter 1782/1984 - loss 0.11408106 - time (sec): 81.48 - samples/sec: 1794.31 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:47:13,932 epoch 2 - iter 1980/1984 - loss 0.11249722 - time (sec): 90.59 - samples/sec: 1807.49 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:47:14,114 ----------------------------------------------------------------------------------------------------
2023-10-13 22:47:14,114 EPOCH 2 done: loss 0.1124 - lr: 0.000027
2023-10-13 22:47:17,587 DEV : loss 0.09548649936914444 - f1-score (micro avg) 0.744
2023-10-13 22:47:17,608 saving best model
2023-10-13 22:47:18,148 ----------------------------------------------------------------------------------------------------
2023-10-13 22:47:27,115 epoch 3 - iter 198/1984 - loss 0.06800668 - time (sec): 8.96 - samples/sec: 1822.91 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:47:36,104 epoch 3 - iter 396/1984 - loss 0.07109942 - time (sec): 17.95 - samples/sec: 1798.42 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:47:45,084 epoch 3 - iter 594/1984 - loss 0.07668389 - time (sec): 26.93 - samples/sec: 1795.77 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:47:54,089 epoch 3 - iter 792/1984 - loss 0.07789366 - time (sec): 35.94 - samples/sec: 1817.49 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:48:03,169 epoch 3 - iter 990/1984 - loss 0.07895567 - time (sec): 45.02 - samples/sec: 1825.49 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:48:12,181 epoch 3 - iter 1188/1984 - loss 0.08032932 - time (sec): 54.03 - samples/sec: 1818.24 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:48:21,733 epoch 3 - iter 1386/1984 - loss 0.08285120 - time (sec): 63.58 - samples/sec: 1802.90 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:48:30,700 epoch 3 - iter 1584/1984 - loss 0.08388123 - time (sec): 72.55 - samples/sec: 1799.06 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:48:39,738 epoch 3 - iter 1782/1984 - loss 0.08563997 - time (sec): 81.59 - samples/sec: 1800.26 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:48:48,772 epoch 3 - iter 1980/1984 - loss 0.08470780 - time (sec): 90.62 - samples/sec: 1805.95 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:48:48,950 ----------------------------------------------------------------------------------------------------
2023-10-13 22:48:48,950 EPOCH 3 done: loss 0.0847 - lr: 0.000023
2023-10-13 22:48:52,399 DEV : loss 0.12796364724636078 - f1-score (micro avg) 0.7407
2023-10-13 22:48:52,419 ----------------------------------------------------------------------------------------------------
2023-10-13 22:49:01,819 epoch 4 - iter 198/1984 - loss 0.05800278 - time (sec): 9.40 - samples/sec: 1812.92 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:49:10,901 epoch 4 - iter 396/1984 - loss 0.05641167 - time (sec): 18.48 - samples/sec: 1793.44 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:49:19,796 epoch 4 - iter 594/1984 - loss 0.05743702 - time (sec): 27.38 - samples/sec: 1787.23 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:49:28,726 epoch 4 - iter 792/1984 - loss 0.05915402 - time (sec): 36.31 - samples/sec: 1798.85 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:49:37,841 epoch 4 - iter 990/1984 - loss 0.05947704 - time (sec): 45.42 - samples/sec: 1795.29 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:49:46,872 epoch 4 - iter 1188/1984 - loss 0.06239465 - time (sec): 54.45 - samples/sec: 1801.43 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:49:55,856 epoch 4 - iter 1386/1984 - loss 0.06300882 - time (sec): 63.44 - samples/sec: 1807.04 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:50:04,872 epoch 4 - iter 1584/1984 - loss 0.06319248 - time (sec): 72.45 - samples/sec: 1804.06 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:50:13,909 epoch 4 - iter 1782/1984 - loss 0.06395252 - time (sec): 81.49 - samples/sec: 1810.75 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:50:22,853 epoch 4 - iter 1980/1984 - loss 0.06299855 - time (sec): 90.43 - samples/sec: 1811.89 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:50:23,030 ----------------------------------------------------------------------------------------------------
2023-10-13 22:50:23,030 EPOCH 4 done: loss 0.0631 - lr: 0.000020
2023-10-13 22:50:26,447 DEV : loss 0.18204569816589355 - f1-score (micro avg) 0.738
2023-10-13 22:50:26,468 ----------------------------------------------------------------------------------------------------
2023-10-13 22:50:35,948 epoch 5 - iter 198/1984 - loss 0.04150094 - time (sec): 9.48 - samples/sec: 1789.33 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:50:45,154 epoch 5 - iter 396/1984 - loss 0.04233067 - time (sec): 18.69 - samples/sec: 1769.03 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:50:54,257 epoch 5 - iter 594/1984 - loss 0.04297481 - time (sec): 27.79 - samples/sec: 1815.26 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:51:03,331 epoch 5 - iter 792/1984 - loss 0.04244711 - time (sec): 36.86 - samples/sec: 1802.29 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:51:12,472 epoch 5 - iter 990/1984 - loss 0.04326527 - time (sec): 46.00 - samples/sec: 1802.41 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:51:21,600 epoch 5 - iter 1188/1984 - loss 0.04560535 - time (sec): 55.13 - samples/sec: 1794.02 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:51:30,456 epoch 5 - iter 1386/1984 - loss 0.04527611 - time (sec): 63.99 - samples/sec: 1788.59 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:51:39,401 epoch 5 - iter 1584/1984 - loss 0.04572642 - time (sec): 72.93 - samples/sec: 1790.69 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:51:48,362 epoch 5 - iter 1782/1984 - loss 0.04653709 - time (sec): 81.89 - samples/sec: 1806.98 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:51:57,204 epoch 5 - iter 1980/1984 - loss 0.04705024 - time (sec): 90.73 - samples/sec: 1801.81 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:51:57,437 ----------------------------------------------------------------------------------------------------
2023-10-13 22:51:57,437 EPOCH 5 done: loss 0.0470 - lr: 0.000017
2023-10-13 22:52:01,269 DEV : loss 0.18338747322559357 - f1-score (micro avg) 0.7485
2023-10-13 22:52:01,290 saving best model
2023-10-13 22:52:01,805 ----------------------------------------------------------------------------------------------------
2023-10-13 22:52:10,662 epoch 6 - iter 198/1984 - loss 0.03423276 - time (sec): 8.85 - samples/sec: 1775.29 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:52:19,672 epoch 6 - iter 396/1984 - loss 0.03504659 - time (sec): 17.86 - samples/sec: 1783.45 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:52:28,638 epoch 6 - iter 594/1984 - loss 0.03522723 - time (sec): 26.83 - samples/sec: 1801.61 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:52:37,746 epoch 6 - iter 792/1984 - loss 0.03512497 - time (sec): 35.93 - samples/sec: 1812.74 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:52:46,705 epoch 6 - iter 990/1984 - loss 0.03574142 - time (sec): 44.89 - samples/sec: 1815.89 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:52:55,725 epoch 6 - iter 1188/1984 - loss 0.03636170 - time (sec): 53.91 - samples/sec: 1817.02 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:53:04,837 epoch 6 - iter 1386/1984 - loss 0.03573438 - time (sec): 63.03 - samples/sec: 1821.47 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:53:13,757 epoch 6 - iter 1584/1984 - loss 0.03570621 - time (sec): 71.95 - samples/sec: 1821.80 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:53:23,051 epoch 6 - iter 1782/1984 - loss 0.03561765 - time (sec): 81.24 - samples/sec: 1803.74 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:53:32,057 epoch 6 - iter 1980/1984 - loss 0.03560613 - time (sec): 90.25 - samples/sec: 1811.62 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:53:32,243 ----------------------------------------------------------------------------------------------------
2023-10-13 22:53:32,243 EPOCH 6 done: loss 0.0355 - lr: 0.000013
2023-10-13 22:53:35,667 DEV : loss 0.19200846552848816 - f1-score (micro avg) 0.7571
2023-10-13 22:53:35,690 saving best model
2023-10-13 22:53:36,227 ----------------------------------------------------------------------------------------------------
2023-10-13 22:53:45,464 epoch 7 - iter 198/1984 - loss 0.01991158 - time (sec): 9.23 - samples/sec: 1858.36 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:53:54,516 epoch 7 - iter 396/1984 - loss 0.02651805 - time (sec): 18.28 - samples/sec: 1844.81 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:54:03,473 epoch 7 - iter 594/1984 - loss 0.02637781 - time (sec): 27.24 - samples/sec: 1848.97 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:54:12,481 epoch 7 - iter 792/1984 - loss 0.02677852 - time (sec): 36.25 - samples/sec: 1845.67 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:54:21,402 epoch 7 - iter 990/1984 - loss 0.02881749 - time (sec): 45.17 - samples/sec: 1842.80 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:54:30,402 epoch 7 - iter 1188/1984 - loss 0.02858223 - time (sec): 54.17 - samples/sec: 1853.37 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:54:39,379 epoch 7 - iter 1386/1984 - loss 0.02830102 - time (sec): 63.15 - samples/sec: 1833.93 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:54:48,384 epoch 7 - iter 1584/1984 - loss 0.02747017 - time (sec): 72.15 - samples/sec: 1827.82 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:54:57,360 epoch 7 - iter 1782/1984 - loss 0.02717714 - time (sec): 81.13 - samples/sec: 1826.93 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:55:06,509 epoch 7 - iter 1980/1984 - loss 0.02758899 - time (sec): 90.28 - samples/sec: 1812.63 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:55:06,690 ----------------------------------------------------------------------------------------------------
2023-10-13 22:55:06,690 EPOCH 7 done: loss 0.0275 - lr: 0.000010
2023-10-13 22:55:10,051 DEV : loss 0.20744635164737701 - f1-score (micro avg) 0.7666
2023-10-13 22:55:10,071 saving best model
2023-10-13 22:55:10,919 ----------------------------------------------------------------------------------------------------
2023-10-13 22:55:19,880 epoch 8 - iter 198/1984 - loss 0.02610314 - time (sec): 8.96 - samples/sec: 1850.88 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:55:28,864 epoch 8 - iter 396/1984 - loss 0.02132016 - time (sec): 17.94 - samples/sec: 1827.81 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:55:38,088 epoch 8 - iter 594/1984 - loss 0.02012992 - time (sec): 27.17 - samples/sec: 1814.18 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:55:47,028 epoch 8 - iter 792/1984 - loss 0.02027899 - time (sec): 36.11 - samples/sec: 1807.99 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:55:56,139 epoch 8 - iter 990/1984 - loss 0.01932362 - time (sec): 45.22 - samples/sec: 1797.05 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:56:05,665 epoch 8 - iter 1188/1984 - loss 0.02059177 - time (sec): 54.74 - samples/sec: 1790.75 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:56:14,696 epoch 8 - iter 1386/1984 - loss 0.02038701 - time (sec): 63.78 - samples/sec: 1802.34 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:56:23,565 epoch 8 - iter 1584/1984 - loss 0.02069282 - time (sec): 72.64 - samples/sec: 1803.00 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:56:32,647 epoch 8 - iter 1782/1984 - loss 0.02056342 - time (sec): 81.73 - samples/sec: 1798.72 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:56:41,597 epoch 8 - iter 1980/1984 - loss 0.02020247 - time (sec): 90.68 - samples/sec: 1804.78 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:56:41,779 ----------------------------------------------------------------------------------------------------
2023-10-13 22:56:41,779 EPOCH 8 done: loss 0.0203 - lr: 0.000007
2023-10-13 22:56:45,247 DEV : loss 0.2160894274711609 - f1-score (micro avg) 0.7539
2023-10-13 22:56:45,268 ----------------------------------------------------------------------------------------------------
2023-10-13 22:56:54,505 epoch 9 - iter 198/1984 - loss 0.00746271 - time (sec): 9.24 - samples/sec: 1666.05 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:57:03,468 epoch 9 - iter 396/1984 - loss 0.01246335 - time (sec): 18.20 - samples/sec: 1713.75 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:57:12,460 epoch 9 - iter 594/1984 - loss 0.01222490 - time (sec): 27.19 - samples/sec: 1776.85 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:57:21,470 epoch 9 - iter 792/1984 - loss 0.01220938 - time (sec): 36.20 - samples/sec: 1795.95 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:57:30,847 epoch 9 - iter 990/1984 - loss 0.01368129 - time (sec): 45.58 - samples/sec: 1814.95 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:57:39,889 epoch 9 - iter 1188/1984 - loss 0.01266492 - time (sec): 54.62 - samples/sec: 1814.26 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:57:49,004 epoch 9 - iter 1386/1984 - loss 0.01257140 - time (sec): 63.74 - samples/sec: 1808.68 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:57:58,054 epoch 9 - iter 1584/1984 - loss 0.01226365 - time (sec): 72.79 - samples/sec: 1807.51 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:58:06,967 epoch 9 - iter 1782/1984 - loss 0.01268987 - time (sec): 81.70 - samples/sec: 1802.17 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:58:15,967 epoch 9 - iter 1980/1984 - loss 0.01248644 - time (sec): 90.70 - samples/sec: 1804.81 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:58:16,146 ----------------------------------------------------------------------------------------------------
2023-10-13 22:58:16,146 EPOCH 9 done: loss 0.0126 - lr: 0.000003
2023-10-13 22:58:19,625 DEV : loss 0.22268585860729218 - f1-score (micro avg) 0.7629
2023-10-13 22:58:19,646 ----------------------------------------------------------------------------------------------------
2023-10-13 22:58:28,696 epoch 10 - iter 198/1984 - loss 0.01007200 - time (sec): 9.05 - samples/sec: 1701.45 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:58:37,663 epoch 10 - iter 396/1984 - loss 0.00885874 - time (sec): 18.02 - samples/sec: 1751.39 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:58:46,584 epoch 10 - iter 594/1984 - loss 0.00972978 - time (sec): 26.94 - samples/sec: 1752.48 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:58:55,648 epoch 10 - iter 792/1984 - loss 0.00994088 - time (sec): 36.00 - samples/sec: 1761.21 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:59:04,705 epoch 10 - iter 990/1984 - loss 0.01002788 - time (sec): 45.06 - samples/sec: 1769.36 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:59:13,711 epoch 10 - iter 1188/1984 - loss 0.00996187 - time (sec): 54.06 - samples/sec: 1790.42 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:59:22,835 epoch 10 - iter 1386/1984 - loss 0.00984624 - time (sec): 63.19 - samples/sec: 1811.97 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:59:31,920 epoch 10 - iter 1584/1984 - loss 0.00915830 - time (sec): 72.27 - samples/sec: 1815.67 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:59:41,048 epoch 10 - iter 1782/1984 - loss 0.00916070 - time (sec): 81.40 - samples/sec: 1810.01 - lr: 0.000000 - momentum: 0.000000
2023-10-13 22:59:50,115 epoch 10 - iter 1980/1984 - loss 0.00968803 - time (sec): 90.47 - samples/sec: 1809.43 - lr: 0.000000 - momentum: 0.000000
2023-10-13 22:59:50,294 ----------------------------------------------------------------------------------------------------
2023-10-13 22:59:50,294 EPOCH 10 done: loss 0.0097 - lr: 0.000000
2023-10-13 22:59:54,151 DEV : loss 0.2287728190422058 - f1-score (micro avg) 0.7659
2023-10-13 22:59:54,607 ----------------------------------------------------------------------------------------------------
2023-10-13 22:59:54,608 Loading model from best epoch ...
2023-10-13 22:59:56,028 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
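The 13 tags above follow the BIOES scheme (Begin, Inside, Outside, End, Single) over the PER/LOC/ORG entity types. A small decoder showing how such a tag sequence maps back to entity spans (a generic sketch of the scheme, not Flair's internal implementation):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (type, start, end_exclusive) spans."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":                       # single-token entity
            spans.append((label, i, i + 1))
        elif prefix == "B":                     # entity begins
            start, etype = i, label
        elif prefix == "E" and etype == label:  # entity ends
            spans.append((etype, start, i + 1))
            start, etype = None, None
        elif prefix != "I":                     # "O" or an inconsistent tag
            start, etype = None, None
    return spans

tags = ["O", "B-PER", "E-PER", "O", "S-LOC", "B-ORG", "I-ORG", "E-ORG"]
print(bioes_to_spans(tags))  # [('PER', 1, 3), ('LOC', 4, 5), ('ORG', 5, 8)]
```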
2023-10-13 22:59:59,358
Results:
- F-score (micro) 0.7872
- F-score (macro) 0.6794
- Accuracy 0.6667
By class:
              precision    recall  f1-score   support

         LOC     0.8251    0.8718    0.8478       655
         PER     0.7406    0.7937    0.7662       223
         ORG     0.5915    0.3307    0.4242       127

   micro avg     0.7884    0.7861    0.7872      1005
   macro avg     0.7191    0.6654    0.6794      1005
weighted avg     0.7769    0.7861    0.7762      1005
2023-10-13 22:59:59,358 ----------------------------------------------------------------------------------------------------
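The micro, macro, and weighted averages in the final report follow directly from the per-class rows. A quick consistency check in plain Python (numbers copied from the table above):

```python
# Per-class f1-score and support from the final report: LOC, PER, ORG.
f1      = [0.8478, 0.7662, 0.4242]
support = [655, 223, 127]

# Macro average: unweighted mean over the three classes.
macro_f1 = sum(f1) / len(f1)

# Weighted average: mean weighted by each class's support.
weighted_f1 = sum(f * s for f, s in zip(f1, support)) / sum(support)

# Micro f1: harmonic mean of the pooled precision and recall.
p, r = 0.7884, 0.7861
micro_f1 = 2 * p * r / (p + r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# 0.6794 0.7762 0.7872
```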