2023-10-13 22:44:08,972 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
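For reference, the tagger printed above can be rebuilt with the Flair API roughly as follows. This is a sketch reconstructed from the log and from the base path given further down (hmBERT backbone dbmdz/bert-base-historic-multilingual-cased, last layer only, first-subtoken pooling, no CRF), not the exact training script; constructor arguments that are not visible in the log are left at their defaults.

```python
from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Corpus is only needed here to build the label dictionary (see the corpus entry below).
corpus = NER_ICDAR_EUROPEANA(language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")

# hmBERT backbone: a BertModel with a 32001-token vocabulary, 12 layers, hidden size 768.
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-cased",
    layers="-1",               # "layers-1" in the base path: use only the last layer
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,            # transformer weights are updated during training
)

# Plain linear head (768 -> 13 BIOES tags) trained with CrossEntropyLoss: no CRF, no RNN.
# Flair's default locked dropout of 0.5 accounts for the LockedDropout(p=0.5) module above.
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,             # "crfFalse" in the base path
    use_rnn=False,
    reproject_embeddings=False,
)
```
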
2023-10-13 22:44:08,973 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
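The corpus is the French split of the ICDAR-Europeana NER dataset, which Flair downloads into ~/.flair/datasets/ner_icdar_europeana/fr on first use. A minimal loading sketch (assuming the language argument selects the split):

```python
from flair.datasets import NER_ICDAR_EUROPEANA

corpus = NER_ICDAR_EUROPEANA(language="fr")

# Should report the same split sizes as the log: 7936 train + 992 dev + 992 test sentences.
print(corpus)
print(corpus.make_label_dictionary(label_type="ner"))  # PER, LOC, ORG span labels
```
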
2023-10-13 22:44:08,973 Train: 7936 sentences
2023-10-13 22:44:08,973 (train_with_dev=False, train_with_test=False)
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 Training Params:
2023-10-13 22:44:08,973 - learning_rate: "3e-05"
2023-10-13 22:44:08,973 - mini_batch_size: "4"
2023-10-13 22:44:08,973 - max_epochs: "10"
2023-10-13 22:44:08,973 - shuffle: "True"
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 Plugins:
2023-10-13 22:44:08,973 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,974 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 22:44:08,974 - metric: "('micro avg', 'f1-score')"
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,974 Computation:
2023-10-13 22:44:08,974 - compute on device: cuda:0
2023-10-13 22:44:08,974 - embedding storage: none
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,974 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
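Putting the header together: the training parameters, the linear warm-up schedule, the micro-F1 selection metric and the device/storage settings above correspond to a single ModelTrainer.fine_tune call along these lines (continuing the corpus and tagger sketches above; parameter names are Flair's, values are read from the log, and the warm-up fraction and selection metric shown are the fine_tune defaults):

```python
import torch
import flair
from flair.trainers import ModelTrainer

flair.device = torch.device("cuda:0")  # "compute on device: cuda:0"

# corpus and tagger as constructed in the sketches above
trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,                              # logged above; also the default
    embeddings_storage_mode="none",            # "embedding storage: none"
    main_evaluation_metric=("micro avg", "f1-score"),
    # fine_tune uses AdamW with a linear learning-rate schedule; its default 10% warm-up
    # matches the "LinearScheduler | warmup_fraction: '0.1'" plugin entry above.
)
```
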
2023-10-13 22:44:18,081 epoch 1 - iter 198/1984 - loss 1.81064836 - time (sec): 9.11 - samples/sec: 1829.54 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 22:44:27,066 epoch 1 - iter 396/1984 - loss 1.08951470 - time (sec): 18.09 - samples/sec: 1811.50 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 22:44:36,144 epoch 1 - iter 594/1984 - loss 0.79725090 - time (sec): 27.17 - samples/sec: 1822.29 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 22:44:45,070 epoch 1 - iter 792/1984 - loss 0.65426686 - time (sec): 36.10 - samples/sec: 1820.77 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 22:44:54,112 epoch 1 - iter 990/1984 - loss 0.55873840 - time (sec): 45.14 - samples/sec: 1826.09 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 22:45:02,966 epoch 1 - iter 1188/1984 - loss 0.49032838 - time (sec): 53.99 - samples/sec: 1833.75 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 22:45:11,931 epoch 1 - iter 1386/1984 - loss 0.44094947 - time (sec): 62.96 - samples/sec: 1830.34 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 22:45:20,963 epoch 1 - iter 1584/1984 - loss 0.40474175 - time (sec): 71.99 - samples/sec: 1830.66 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 22:45:30,256 epoch 1 - iter 1782/1984 - loss 0.37525747 - time (sec): 81.28 - samples/sec: 1818.44 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 22:45:39,648 epoch 1 - iter 1980/1984 - loss 0.35281987 - time (sec): 90.67 - samples/sec: 1805.98 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 22:45:39,839 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:45:39,839 EPOCH 1 done: loss 0.3526 - lr: 0.000030 |
|
2023-10-13 22:45:42,914 DEV : loss 0.10146976262331009 - f1-score (micro avg) 0.7264 |
|
2023-10-13 22:45:42,937 saving best model |
|
2023-10-13 22:45:43,336 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:45:52,299 epoch 2 - iter 198/1984 - loss 0.13085115 - time (sec): 8.96 - samples/sec: 1965.02 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 22:46:01,550 epoch 2 - iter 396/1984 - loss 0.12180186 - time (sec): 18.21 - samples/sec: 1845.38 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 22:46:10,520 epoch 2 - iter 594/1984 - loss 0.11729603 - time (sec): 27.18 - samples/sec: 1843.65 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 22:46:19,430 epoch 2 - iter 792/1984 - loss 0.11624537 - time (sec): 36.09 - samples/sec: 1816.81 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 22:46:28,458 epoch 2 - iter 990/1984 - loss 0.11635377 - time (sec): 45.12 - samples/sec: 1810.56 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 22:46:37,650 epoch 2 - iter 1188/1984 - loss 0.11405465 - time (sec): 54.31 - samples/sec: 1800.34 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 22:46:46,738 epoch 2 - iter 1386/1984 - loss 0.11342189 - time (sec): 63.40 - samples/sec: 1795.44 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 22:46:55,776 epoch 2 - iter 1584/1984 - loss 0.11424700 - time (sec): 72.44 - samples/sec: 1803.73 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 22:47:04,814 epoch 2 - iter 1782/1984 - loss 0.11408106 - time (sec): 81.48 - samples/sec: 1794.31 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 22:47:13,932 epoch 2 - iter 1980/1984 - loss 0.11249722 - time (sec): 90.59 - samples/sec: 1807.49 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 22:47:14,114 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:47:14,114 EPOCH 2 done: loss 0.1124 - lr: 0.000027 |
|
2023-10-13 22:47:17,587 DEV : loss 0.09548649936914444 - f1-score (micro avg) 0.744 |
|
2023-10-13 22:47:17,608 saving best model |
|
2023-10-13 22:47:18,148 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:47:27,115 epoch 3 - iter 198/1984 - loss 0.06800668 - time (sec): 8.96 - samples/sec: 1822.91 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 22:47:36,104 epoch 3 - iter 396/1984 - loss 0.07109942 - time (sec): 17.95 - samples/sec: 1798.42 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 22:47:45,084 epoch 3 - iter 594/1984 - loss 0.07668389 - time (sec): 26.93 - samples/sec: 1795.77 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 22:47:54,089 epoch 3 - iter 792/1984 - loss 0.07789366 - time (sec): 35.94 - samples/sec: 1817.49 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 22:48:03,169 epoch 3 - iter 990/1984 - loss 0.07895567 - time (sec): 45.02 - samples/sec: 1825.49 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 22:48:12,181 epoch 3 - iter 1188/1984 - loss 0.08032932 - time (sec): 54.03 - samples/sec: 1818.24 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 22:48:21,733 epoch 3 - iter 1386/1984 - loss 0.08285120 - time (sec): 63.58 - samples/sec: 1802.90 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 22:48:30,700 epoch 3 - iter 1584/1984 - loss 0.08388123 - time (sec): 72.55 - samples/sec: 1799.06 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 22:48:39,738 epoch 3 - iter 1782/1984 - loss 0.08563997 - time (sec): 81.59 - samples/sec: 1800.26 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 22:48:48,772 epoch 3 - iter 1980/1984 - loss 0.08470780 - time (sec): 90.62 - samples/sec: 1805.95 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 22:48:48,950 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:48:48,950 EPOCH 3 done: loss 0.0847 - lr: 0.000023 |
|
2023-10-13 22:48:52,399 DEV : loss 0.12796364724636078 - f1-score (micro avg) 0.7407 |
|
2023-10-13 22:48:52,419 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:49:01,819 epoch 4 - iter 198/1984 - loss 0.05800278 - time (sec): 9.40 - samples/sec: 1812.92 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 22:49:10,901 epoch 4 - iter 396/1984 - loss 0.05641167 - time (sec): 18.48 - samples/sec: 1793.44 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 22:49:19,796 epoch 4 - iter 594/1984 - loss 0.05743702 - time (sec): 27.38 - samples/sec: 1787.23 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 22:49:28,726 epoch 4 - iter 792/1984 - loss 0.05915402 - time (sec): 36.31 - samples/sec: 1798.85 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 22:49:37,841 epoch 4 - iter 990/1984 - loss 0.05947704 - time (sec): 45.42 - samples/sec: 1795.29 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 22:49:46,872 epoch 4 - iter 1188/1984 - loss 0.06239465 - time (sec): 54.45 - samples/sec: 1801.43 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 22:49:55,856 epoch 4 - iter 1386/1984 - loss 0.06300882 - time (sec): 63.44 - samples/sec: 1807.04 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 22:50:04,872 epoch 4 - iter 1584/1984 - loss 0.06319248 - time (sec): 72.45 - samples/sec: 1804.06 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 22:50:13,909 epoch 4 - iter 1782/1984 - loss 0.06395252 - time (sec): 81.49 - samples/sec: 1810.75 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 22:50:22,853 epoch 4 - iter 1980/1984 - loss 0.06299855 - time (sec): 90.43 - samples/sec: 1811.89 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 22:50:23,030 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:50:23,030 EPOCH 4 done: loss 0.0631 - lr: 0.000020 |
|
2023-10-13 22:50:26,447 DEV : loss 0.18204569816589355 - f1-score (micro avg) 0.738 |
|
2023-10-13 22:50:26,468 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:50:35,948 epoch 5 - iter 198/1984 - loss 0.04150094 - time (sec): 9.48 - samples/sec: 1789.33 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 22:50:45,154 epoch 5 - iter 396/1984 - loss 0.04233067 - time (sec): 18.69 - samples/sec: 1769.03 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 22:50:54,257 epoch 5 - iter 594/1984 - loss 0.04297481 - time (sec): 27.79 - samples/sec: 1815.26 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 22:51:03,331 epoch 5 - iter 792/1984 - loss 0.04244711 - time (sec): 36.86 - samples/sec: 1802.29 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 22:51:12,472 epoch 5 - iter 990/1984 - loss 0.04326527 - time (sec): 46.00 - samples/sec: 1802.41 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 22:51:21,600 epoch 5 - iter 1188/1984 - loss 0.04560535 - time (sec): 55.13 - samples/sec: 1794.02 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 22:51:30,456 epoch 5 - iter 1386/1984 - loss 0.04527611 - time (sec): 63.99 - samples/sec: 1788.59 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 22:51:39,401 epoch 5 - iter 1584/1984 - loss 0.04572642 - time (sec): 72.93 - samples/sec: 1790.69 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 22:51:48,362 epoch 5 - iter 1782/1984 - loss 0.04653709 - time (sec): 81.89 - samples/sec: 1806.98 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 22:51:57,204 epoch 5 - iter 1980/1984 - loss 0.04705024 - time (sec): 90.73 - samples/sec: 1801.81 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 22:51:57,437 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:51:57,437 EPOCH 5 done: loss 0.0470 - lr: 0.000017 |
|
2023-10-13 22:52:01,269 DEV : loss 0.18338747322559357 - f1-score (micro avg) 0.7485 |
|
2023-10-13 22:52:01,290 saving best model |
|
2023-10-13 22:52:01,805 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:52:10,662 epoch 6 - iter 198/1984 - loss 0.03423276 - time (sec): 8.85 - samples/sec: 1775.29 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 22:52:19,672 epoch 6 - iter 396/1984 - loss 0.03504659 - time (sec): 17.86 - samples/sec: 1783.45 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 22:52:28,638 epoch 6 - iter 594/1984 - loss 0.03522723 - time (sec): 26.83 - samples/sec: 1801.61 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 22:52:37,746 epoch 6 - iter 792/1984 - loss 0.03512497 - time (sec): 35.93 - samples/sec: 1812.74 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 22:52:46,705 epoch 6 - iter 990/1984 - loss 0.03574142 - time (sec): 44.89 - samples/sec: 1815.89 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 22:52:55,725 epoch 6 - iter 1188/1984 - loss 0.03636170 - time (sec): 53.91 - samples/sec: 1817.02 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 22:53:04,837 epoch 6 - iter 1386/1984 - loss 0.03573438 - time (sec): 63.03 - samples/sec: 1821.47 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 22:53:13,757 epoch 6 - iter 1584/1984 - loss 0.03570621 - time (sec): 71.95 - samples/sec: 1821.80 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 22:53:23,051 epoch 6 - iter 1782/1984 - loss 0.03561765 - time (sec): 81.24 - samples/sec: 1803.74 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 22:53:32,057 epoch 6 - iter 1980/1984 - loss 0.03560613 - time (sec): 90.25 - samples/sec: 1811.62 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 22:53:32,243 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:53:32,243 EPOCH 6 done: loss 0.0355 - lr: 0.000013 |
|
2023-10-13 22:53:35,667 DEV : loss 0.19200846552848816 - f1-score (micro avg) 0.7571 |
|
2023-10-13 22:53:35,690 saving best model |
|
2023-10-13 22:53:36,227 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:53:45,464 epoch 7 - iter 198/1984 - loss 0.01991158 - time (sec): 9.23 - samples/sec: 1858.36 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 22:53:54,516 epoch 7 - iter 396/1984 - loss 0.02651805 - time (sec): 18.28 - samples/sec: 1844.81 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 22:54:03,473 epoch 7 - iter 594/1984 - loss 0.02637781 - time (sec): 27.24 - samples/sec: 1848.97 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 22:54:12,481 epoch 7 - iter 792/1984 - loss 0.02677852 - time (sec): 36.25 - samples/sec: 1845.67 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 22:54:21,402 epoch 7 - iter 990/1984 - loss 0.02881749 - time (sec): 45.17 - samples/sec: 1842.80 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 22:54:30,402 epoch 7 - iter 1188/1984 - loss 0.02858223 - time (sec): 54.17 - samples/sec: 1853.37 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 22:54:39,379 epoch 7 - iter 1386/1984 - loss 0.02830102 - time (sec): 63.15 - samples/sec: 1833.93 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 22:54:48,384 epoch 7 - iter 1584/1984 - loss 0.02747017 - time (sec): 72.15 - samples/sec: 1827.82 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 22:54:57,360 epoch 7 - iter 1782/1984 - loss 0.02717714 - time (sec): 81.13 - samples/sec: 1826.93 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 22:55:06,509 epoch 7 - iter 1980/1984 - loss 0.02758899 - time (sec): 90.28 - samples/sec: 1812.63 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 22:55:06,690 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:55:06,690 EPOCH 7 done: loss 0.0275 - lr: 0.000010 |
|
2023-10-13 22:55:10,051 DEV : loss 0.20744635164737701 - f1-score (micro avg) 0.7666 |
|
2023-10-13 22:55:10,071 saving best model |
|
2023-10-13 22:55:10,919 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:55:19,880 epoch 8 - iter 198/1984 - loss 0.02610314 - time (sec): 8.96 - samples/sec: 1850.88 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 22:55:28,864 epoch 8 - iter 396/1984 - loss 0.02132016 - time (sec): 17.94 - samples/sec: 1827.81 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 22:55:38,088 epoch 8 - iter 594/1984 - loss 0.02012992 - time (sec): 27.17 - samples/sec: 1814.18 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 22:55:47,028 epoch 8 - iter 792/1984 - loss 0.02027899 - time (sec): 36.11 - samples/sec: 1807.99 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 22:55:56,139 epoch 8 - iter 990/1984 - loss 0.01932362 - time (sec): 45.22 - samples/sec: 1797.05 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 22:56:05,665 epoch 8 - iter 1188/1984 - loss 0.02059177 - time (sec): 54.74 - samples/sec: 1790.75 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 22:56:14,696 epoch 8 - iter 1386/1984 - loss 0.02038701 - time (sec): 63.78 - samples/sec: 1802.34 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 22:56:23,565 epoch 8 - iter 1584/1984 - loss 0.02069282 - time (sec): 72.64 - samples/sec: 1803.00 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 22:56:32,647 epoch 8 - iter 1782/1984 - loss 0.02056342 - time (sec): 81.73 - samples/sec: 1798.72 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 22:56:41,597 epoch 8 - iter 1980/1984 - loss 0.02020247 - time (sec): 90.68 - samples/sec: 1804.78 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 22:56:41,779 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:56:41,779 EPOCH 8 done: loss 0.0203 - lr: 0.000007 |
|
2023-10-13 22:56:45,247 DEV : loss 0.2160894274711609 - f1-score (micro avg) 0.7539 |
|
2023-10-13 22:56:45,268 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:56:54,505 epoch 9 - iter 198/1984 - loss 0.00746271 - time (sec): 9.24 - samples/sec: 1666.05 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 22:57:03,468 epoch 9 - iter 396/1984 - loss 0.01246335 - time (sec): 18.20 - samples/sec: 1713.75 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 22:57:12,460 epoch 9 - iter 594/1984 - loss 0.01222490 - time (sec): 27.19 - samples/sec: 1776.85 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 22:57:21,470 epoch 9 - iter 792/1984 - loss 0.01220938 - time (sec): 36.20 - samples/sec: 1795.95 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 22:57:30,847 epoch 9 - iter 990/1984 - loss 0.01368129 - time (sec): 45.58 - samples/sec: 1814.95 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 22:57:39,889 epoch 9 - iter 1188/1984 - loss 0.01266492 - time (sec): 54.62 - samples/sec: 1814.26 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 22:57:49,004 epoch 9 - iter 1386/1984 - loss 0.01257140 - time (sec): 63.74 - samples/sec: 1808.68 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 22:57:58,054 epoch 9 - iter 1584/1984 - loss 0.01226365 - time (sec): 72.79 - samples/sec: 1807.51 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 22:58:06,967 epoch 9 - iter 1782/1984 - loss 0.01268987 - time (sec): 81.70 - samples/sec: 1802.17 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 22:58:15,967 epoch 9 - iter 1980/1984 - loss 0.01248644 - time (sec): 90.70 - samples/sec: 1804.81 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 22:58:16,146 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:58:16,146 EPOCH 9 done: loss 0.0126 - lr: 0.000003 |
|
2023-10-13 22:58:19,625 DEV : loss 0.22268585860729218 - f1-score (micro avg) 0.7629 |
|
2023-10-13 22:58:19,646 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:58:28,696 epoch 10 - iter 198/1984 - loss 0.01007200 - time (sec): 9.05 - samples/sec: 1701.45 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 22:58:37,663 epoch 10 - iter 396/1984 - loss 0.00885874 - time (sec): 18.02 - samples/sec: 1751.39 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 22:58:46,584 epoch 10 - iter 594/1984 - loss 0.00972978 - time (sec): 26.94 - samples/sec: 1752.48 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 22:58:55,648 epoch 10 - iter 792/1984 - loss 0.00994088 - time (sec): 36.00 - samples/sec: 1761.21 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 22:59:04,705 epoch 10 - iter 990/1984 - loss 0.01002788 - time (sec): 45.06 - samples/sec: 1769.36 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 22:59:13,711 epoch 10 - iter 1188/1984 - loss 0.00996187 - time (sec): 54.06 - samples/sec: 1790.42 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 22:59:22,835 epoch 10 - iter 1386/1984 - loss 0.00984624 - time (sec): 63.19 - samples/sec: 1811.97 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 22:59:31,920 epoch 10 - iter 1584/1984 - loss 0.00915830 - time (sec): 72.27 - samples/sec: 1815.67 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 22:59:41,048 epoch 10 - iter 1782/1984 - loss 0.00916070 - time (sec): 81.40 - samples/sec: 1810.01 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 22:59:50,115 epoch 10 - iter 1980/1984 - loss 0.00968803 - time (sec): 90.47 - samples/sec: 1809.43 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 22:59:50,294 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:59:50,294 EPOCH 10 done: loss 0.0097 - lr: 0.000000 |
|
2023-10-13 22:59:54,151 DEV : loss 0.2287728190422058 - f1-score (micro avg) 0.7659 |
|
2023-10-13 22:59:54,607 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:59:54,608 Loading model from best epoch ...
2023-10-13 22:59:56,028 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
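The best checkpoint (best-model.pt, saved after epoch 7, the highest dev micro-F1 of 0.7666) is reloaded here for the final evaluation. The 13-tag dictionary above is the internal BIOES encoding of the three entity types; applying the saved model to new text could look like the sketch below (the example sentence is illustrative only):

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Path as logged under "Model training base path"
tagger = SequenceTagger.load(
    "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

sentence = Sentence("Le général Changarnier est arrivé hier à Paris.")
tagger.predict(sentence)

# BIOES tags are decoded back into PER / LOC / ORG spans
for span in sentence.get_spans("ner"):
    label = span.get_label("ner")
    print(span.text, label.value, label.score)
```
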
2023-10-13 22:59:59,358 
Results:
- F-score (micro) 0.7872
- F-score (macro) 0.6794
- Accuracy 0.6667

By class:
              precision    recall  f1-score   support

         LOC     0.8251    0.8718    0.8478       655
         PER     0.7406    0.7937    0.7662       223
         ORG     0.5915    0.3307    0.4242       127

   micro avg     0.7884    0.7861    0.7872      1005
   macro avg     0.7191    0.6654    0.6794      1005
weighted avg     0.7769    0.7861    0.7762      1005
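As a quick consistency check, the macro and weighted F-scores follow directly from the per-class rows above (unweighted vs. support-weighted means), and the micro F-score is the harmonic mean of the micro precision and recall:

```python
# Values copied from the "By class" table above
f1 = {"LOC": 0.8478, "PER": 0.7662, "ORG": 0.4242}
support = {"LOC": 655, "PER": 223, "ORG": 127}

macro_f1 = sum(f1.values()) / len(f1)                                       # -> 0.6794
weighted_f1 = sum(f1[c] * support[c] for c in f1) / sum(support.values())   # -> 0.7762

p, r = 0.7884, 0.7861                 # micro-averaged precision and recall
micro_f1 = 2 * p * r / (p + r)                                              # -> 0.7872

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```
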
2023-10-13 22:59:59,358 ----------------------------------------------------------------------------------------------------