|
2023-10-14 00:33:33,347 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:33:33,348 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-14 00:33:33,348 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:33:33,349 MultiCorpus: 7936 train + 992 dev + 992 test sentences |
|
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr |
|
2023-10-14 00:33:33,349 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:33:33,349 Train: 7936 sentences |
|
2023-10-14 00:33:33,349 (train_with_dev=False, train_with_test=False) |
|
2023-10-14 00:33:33,349 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:33:33,349 Training Params: |
|
2023-10-14 00:33:33,349 - learning_rate: "3e-05" |
|
2023-10-14 00:33:33,349 - mini_batch_size: "4" |
|
2023-10-14 00:33:33,349 - max_epochs: "10" |
|
2023-10-14 00:33:33,349 - shuffle: "True" |
|
2023-10-14 00:33:33,349 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:33:33,349 Plugins: |
|
2023-10-14 00:33:33,349 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-14 00:33:33,349 ---------------------------------------------------------------------------------------------------- |
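The lr column in the per-iteration lines below can be read against the parameters above (learning_rate 3e-05, mini_batch_size 4, max_epochs 10, warmup_fraction 0.1). The following is a minimal sketch of a linear warmup followed by a linear decay to zero. It assumes the LinearScheduler plugin follows this common formulation; the function name `linear_schedule_lr` is hypothetical and is not part of Flair.

```python
def linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction):
    """Learning rate after `step` optimizer steps (assumed formulation).

    Linear ramp from 0 to peak_lr during warmup, then linear decay to 0.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the log: 7936 training sentences, batch size 4, 10 epochs.
steps_per_epoch = 7936 // 4
total_steps = steps_per_epoch * 10

for step in (198, steps_per_epoch, 2 * steps_per_epoch, total_steps):
    lr = linear_schedule_lr(step, 3e-5, total_steps, 0.1)
    print(f"step {step:>5}: lr {lr:.6f}")
```

Under these assumptions the warmup covers exactly the first epoch (1984 of 19840 steps), which is consistent with the logged lr reaching 0.000030 at the end of epoch 1 and falling to 0.000027 by the end of epoch 2.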
|
2023-10-14 00:33:33,349 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-14 00:33:33,349 - metric: "('micro avg', 'f1-score')" |
|
2023-10-14 00:33:33,349 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:33:33,349 Computation: |
|
2023-10-14 00:33:33,349 - compute on device: cuda:0 |
|
2023-10-14 00:33:33,349 - embedding storage: none |
|
2023-10-14 00:33:33,349 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:33:33,349 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-14 00:33:33,349 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:33:33,349 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:33:42,420 epoch 1 - iter 198/1984 - loss 1.84648731 - time (sec): 9.07 - samples/sec: 1702.75 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 00:33:51,452 epoch 1 - iter 396/1984 - loss 1.08861314 - time (sec): 18.10 - samples/sec: 1741.41 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 00:34:00,444 epoch 1 - iter 594/1984 - loss 0.79789327 - time (sec): 27.09 - samples/sec: 1775.55 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 00:34:09,335 epoch 1 - iter 792/1984 - loss 0.64340507 - time (sec): 35.98 - samples/sec: 1788.69 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 00:34:18,432 epoch 1 - iter 990/1984 - loss 0.55085242 - time (sec): 45.08 - samples/sec: 1792.80 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 00:34:27,429 epoch 1 - iter 1188/1984 - loss 0.48020788 - time (sec): 54.08 - samples/sec: 1809.66 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 00:34:36,495 epoch 1 - iter 1386/1984 - loss 0.43363784 - time (sec): 63.14 - samples/sec: 1801.24 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 00:34:45,534 epoch 1 - iter 1584/1984 - loss 0.39617398 - time (sec): 72.18 - samples/sec: 1801.87 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 00:34:54,461 epoch 1 - iter 1782/1984 - loss 0.36780931 - time (sec): 81.11 - samples/sec: 1803.46 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 00:35:03,459 epoch 1 - iter 1980/1984 - loss 0.34466223 - time (sec): 90.11 - samples/sec: 1813.44 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 00:35:03,679 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:35:03,679 EPOCH 1 done: loss 0.3439 - lr: 0.000030 |
|
2023-10-14 00:35:06,904 DEV : loss 0.11840364336967468 - f1-score (micro avg) 0.6206 |
|
2023-10-14 00:35:06,925 saving best model |
|
2023-10-14 00:35:07,303 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:35:16,413 epoch 2 - iter 198/1984 - loss 0.14075901 - time (sec): 9.11 - samples/sec: 1669.48 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 00:35:25,436 epoch 2 - iter 396/1984 - loss 0.12400082 - time (sec): 18.13 - samples/sec: 1720.40 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 00:35:34,763 epoch 2 - iter 594/1984 - loss 0.12219424 - time (sec): 27.46 - samples/sec: 1716.50 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 00:35:43,860 epoch 2 - iter 792/1984 - loss 0.11620520 - time (sec): 36.56 - samples/sec: 1741.22 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 00:35:52,817 epoch 2 - iter 990/1984 - loss 0.11433495 - time (sec): 45.51 - samples/sec: 1776.12 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 00:36:01,818 epoch 2 - iter 1188/1984 - loss 0.11548844 - time (sec): 54.51 - samples/sec: 1792.84 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 00:36:10,792 epoch 2 - iter 1386/1984 - loss 0.11442529 - time (sec): 63.49 - samples/sec: 1799.30 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 00:36:19,715 epoch 2 - iter 1584/1984 - loss 0.11261525 - time (sec): 72.41 - samples/sec: 1798.49 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 00:36:29,002 epoch 2 - iter 1782/1984 - loss 0.11148968 - time (sec): 81.70 - samples/sec: 1799.89 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 00:36:38,044 epoch 2 - iter 1980/1984 - loss 0.11272837 - time (sec): 90.74 - samples/sec: 1801.90 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 00:36:38,255 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:36:38,256 EPOCH 2 done: loss 0.1127 - lr: 0.000027 |
|
2023-10-14 00:36:41,653 DEV : loss 0.12019851058721542 - f1-score (micro avg) 0.7239 |
|
2023-10-14 00:36:41,675 saving best model |
|
2023-10-14 00:36:42,206 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:36:51,213 epoch 3 - iter 198/1984 - loss 0.06749262 - time (sec): 9.01 - samples/sec: 1676.79 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-14 00:37:00,237 epoch 3 - iter 396/1984 - loss 0.07556813 - time (sec): 18.03 - samples/sec: 1799.64 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-14 00:37:09,114 epoch 3 - iter 594/1984 - loss 0.07935076 - time (sec): 26.91 - samples/sec: 1779.72 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-14 00:37:18,151 epoch 3 - iter 792/1984 - loss 0.08102940 - time (sec): 35.94 - samples/sec: 1773.27 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 00:37:27,454 epoch 3 - iter 990/1984 - loss 0.07941764 - time (sec): 45.25 - samples/sec: 1800.13 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 00:37:36,997 epoch 3 - iter 1188/1984 - loss 0.08141394 - time (sec): 54.79 - samples/sec: 1785.56 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 00:37:46,204 epoch 3 - iter 1386/1984 - loss 0.08073553 - time (sec): 64.00 - samples/sec: 1790.81 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 00:37:55,490 epoch 3 - iter 1584/1984 - loss 0.08000645 - time (sec): 73.28 - samples/sec: 1792.85 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 00:38:04,457 epoch 3 - iter 1782/1984 - loss 0.08063126 - time (sec): 82.25 - samples/sec: 1786.95 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 00:38:13,357 epoch 3 - iter 1980/1984 - loss 0.08177117 - time (sec): 91.15 - samples/sec: 1794.98 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 00:38:13,537 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:38:13,537 EPOCH 3 done: loss 0.0817 - lr: 0.000023 |
|
2023-10-14 00:38:17,007 DEV : loss 0.1288405805826187 - f1-score (micro avg) 0.763 |
|
2023-10-14 00:38:17,032 saving best model |
|
2023-10-14 00:38:17,503 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:38:26,719 epoch 4 - iter 198/1984 - loss 0.05060713 - time (sec): 9.21 - samples/sec: 1901.80 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 00:38:35,936 epoch 4 - iter 396/1984 - loss 0.05243662 - time (sec): 18.43 - samples/sec: 1833.07 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 00:38:44,927 epoch 4 - iter 594/1984 - loss 0.05486600 - time (sec): 27.42 - samples/sec: 1826.93 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 00:38:54,019 epoch 4 - iter 792/1984 - loss 0.05512558 - time (sec): 36.51 - samples/sec: 1809.34 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 00:39:03,272 epoch 4 - iter 990/1984 - loss 0.05454064 - time (sec): 45.76 - samples/sec: 1807.82 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 00:39:12,825 epoch 4 - iter 1188/1984 - loss 0.05627708 - time (sec): 55.32 - samples/sec: 1785.28 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 00:39:21,891 epoch 4 - iter 1386/1984 - loss 0.05694993 - time (sec): 64.38 - samples/sec: 1775.39 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 00:39:30,648 epoch 4 - iter 1584/1984 - loss 0.05757246 - time (sec): 73.14 - samples/sec: 1784.06 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 00:39:39,551 epoch 4 - iter 1782/1984 - loss 0.05843888 - time (sec): 82.04 - samples/sec: 1781.71 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 00:39:48,827 epoch 4 - iter 1980/1984 - loss 0.06142881 - time (sec): 91.32 - samples/sec: 1792.12 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 00:39:49,033 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:39:49,033 EPOCH 4 done: loss 0.0614 - lr: 0.000020 |
|
2023-10-14 00:39:52,419 DEV : loss 0.14116324484348297 - f1-score (micro avg) 0.7451 |
|
2023-10-14 00:39:52,439 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:40:01,484 epoch 5 - iter 198/1984 - loss 0.04525493 - time (sec): 9.04 - samples/sec: 1830.03 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 00:40:10,539 epoch 5 - iter 396/1984 - loss 0.04542360 - time (sec): 18.10 - samples/sec: 1842.65 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 00:40:19,522 epoch 5 - iter 594/1984 - loss 0.04551483 - time (sec): 27.08 - samples/sec: 1820.33 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 00:40:28,471 epoch 5 - iter 792/1984 - loss 0.04325209 - time (sec): 36.03 - samples/sec: 1828.91 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 00:40:37,434 epoch 5 - iter 990/1984 - loss 0.04250521 - time (sec): 44.99 - samples/sec: 1829.61 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 00:40:46,472 epoch 5 - iter 1188/1984 - loss 0.04318675 - time (sec): 54.03 - samples/sec: 1830.16 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 00:40:55,443 epoch 5 - iter 1386/1984 - loss 0.04436125 - time (sec): 63.00 - samples/sec: 1812.70 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 00:41:04,539 epoch 5 - iter 1584/1984 - loss 0.04521410 - time (sec): 72.10 - samples/sec: 1820.00 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 00:41:13,633 epoch 5 - iter 1782/1984 - loss 0.04534807 - time (sec): 81.19 - samples/sec: 1813.67 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 00:41:22,673 epoch 5 - iter 1980/1984 - loss 0.04610443 - time (sec): 90.23 - samples/sec: 1814.23 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 00:41:22,851 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:41:22,851 EPOCH 5 done: loss 0.0461 - lr: 0.000017 |
|
2023-10-14 00:41:26,783 DEV : loss 0.15994729101657867 - f1-score (micro avg) 0.7573 |
|
2023-10-14 00:41:26,804 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:41:36,094 epoch 6 - iter 198/1984 - loss 0.04012582 - time (sec): 9.29 - samples/sec: 1865.57 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 00:41:45,056 epoch 6 - iter 396/1984 - loss 0.03880028 - time (sec): 18.25 - samples/sec: 1830.40 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 00:41:54,099 epoch 6 - iter 594/1984 - loss 0.03591012 - time (sec): 27.29 - samples/sec: 1803.29 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 00:42:03,182 epoch 6 - iter 792/1984 - loss 0.03583443 - time (sec): 36.38 - samples/sec: 1810.54 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 00:42:12,153 epoch 6 - iter 990/1984 - loss 0.03618946 - time (sec): 45.35 - samples/sec: 1818.44 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 00:42:21,087 epoch 6 - iter 1188/1984 - loss 0.03540426 - time (sec): 54.28 - samples/sec: 1812.97 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 00:42:30,099 epoch 6 - iter 1386/1984 - loss 0.03591058 - time (sec): 63.29 - samples/sec: 1812.29 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 00:42:39,105 epoch 6 - iter 1584/1984 - loss 0.03527012 - time (sec): 72.30 - samples/sec: 1813.34 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 00:42:48,273 epoch 6 - iter 1782/1984 - loss 0.03466812 - time (sec): 81.47 - samples/sec: 1817.65 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 00:42:57,276 epoch 6 - iter 1980/1984 - loss 0.03459241 - time (sec): 90.47 - samples/sec: 1809.43 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 00:42:57,461 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:42:57,461 EPOCH 6 done: loss 0.0345 - lr: 0.000013 |
|
2023-10-14 00:43:00,862 DEV : loss 0.1856544464826584 - f1-score (micro avg) 0.7573 |
|
2023-10-14 00:43:00,883 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:43:09,962 epoch 7 - iter 198/1984 - loss 0.02135452 - time (sec): 9.08 - samples/sec: 1781.01 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 00:43:18,579 epoch 7 - iter 396/1984 - loss 0.02314302 - time (sec): 17.70 - samples/sec: 1814.63 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 00:43:27,249 epoch 7 - iter 594/1984 - loss 0.02014483 - time (sec): 26.36 - samples/sec: 1861.63 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 00:43:35,932 epoch 7 - iter 792/1984 - loss 0.02217636 - time (sec): 35.05 - samples/sec: 1872.99 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 00:43:44,530 epoch 7 - iter 990/1984 - loss 0.02248853 - time (sec): 43.65 - samples/sec: 1870.13 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 00:43:53,208 epoch 7 - iter 1188/1984 - loss 0.02281695 - time (sec): 52.32 - samples/sec: 1878.85 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 00:44:01,962 epoch 7 - iter 1386/1984 - loss 0.02260346 - time (sec): 61.08 - samples/sec: 1883.39 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 00:44:10,588 epoch 7 - iter 1584/1984 - loss 0.02346248 - time (sec): 69.70 - samples/sec: 1881.17 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 00:44:19,772 epoch 7 - iter 1782/1984 - loss 0.02378911 - time (sec): 78.89 - samples/sec: 1869.24 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 00:44:28,729 epoch 7 - iter 1980/1984 - loss 0.02373852 - time (sec): 87.84 - samples/sec: 1862.99 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 00:44:28,901 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:44:28,901 EPOCH 7 done: loss 0.0238 - lr: 0.000010 |
|
2023-10-14 00:44:32,838 DEV : loss 0.19240804016590118 - f1-score (micro avg) 0.7615 |
|
2023-10-14 00:44:32,859 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:44:41,845 epoch 8 - iter 198/1984 - loss 0.01575059 - time (sec): 8.98 - samples/sec: 1912.11 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 00:44:50,888 epoch 8 - iter 396/1984 - loss 0.01422090 - time (sec): 18.03 - samples/sec: 1844.50 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 00:44:59,843 epoch 8 - iter 594/1984 - loss 0.01545793 - time (sec): 26.98 - samples/sec: 1809.45 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 00:45:09,054 epoch 8 - iter 792/1984 - loss 0.01632885 - time (sec): 36.19 - samples/sec: 1826.74 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 00:45:18,131 epoch 8 - iter 990/1984 - loss 0.01657672 - time (sec): 45.27 - samples/sec: 1833.70 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 00:45:27,193 epoch 8 - iter 1188/1984 - loss 0.01688354 - time (sec): 54.33 - samples/sec: 1838.93 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 00:45:36,076 epoch 8 - iter 1386/1984 - loss 0.01653230 - time (sec): 63.22 - samples/sec: 1837.14 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 00:45:45,263 epoch 8 - iter 1584/1984 - loss 0.01631411 - time (sec): 72.40 - samples/sec: 1821.74 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 00:45:54,512 epoch 8 - iter 1782/1984 - loss 0.01666752 - time (sec): 81.65 - samples/sec: 1811.34 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 00:46:03,561 epoch 8 - iter 1980/1984 - loss 0.01708603 - time (sec): 90.70 - samples/sec: 1805.56 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 00:46:03,739 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:46:03,739 EPOCH 8 done: loss 0.0171 - lr: 0.000007 |
|
2023-10-14 00:46:07,144 DEV : loss 0.20113840699195862 - f1-score (micro avg) 0.7714 |
|
2023-10-14 00:46:07,165 saving best model |
|
2023-10-14 00:46:07,687 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:46:16,719 epoch 9 - iter 198/1984 - loss 0.01121190 - time (sec): 9.03 - samples/sec: 1791.90 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 00:46:25,702 epoch 9 - iter 396/1984 - loss 0.01335710 - time (sec): 18.01 - samples/sec: 1844.29 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 00:46:34,718 epoch 9 - iter 594/1984 - loss 0.01317916 - time (sec): 27.03 - samples/sec: 1851.45 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 00:46:43,597 epoch 9 - iter 792/1984 - loss 0.01271864 - time (sec): 35.91 - samples/sec: 1828.80 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 00:46:52,551 epoch 9 - iter 990/1984 - loss 0.01201250 - time (sec): 44.86 - samples/sec: 1834.11 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 00:47:01,618 epoch 9 - iter 1188/1984 - loss 0.01186694 - time (sec): 53.93 - samples/sec: 1830.03 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 00:47:10,838 epoch 9 - iter 1386/1984 - loss 0.01295976 - time (sec): 63.15 - samples/sec: 1817.33 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 00:47:19,883 epoch 9 - iter 1584/1984 - loss 0.01303216 - time (sec): 72.19 - samples/sec: 1825.07 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 00:47:28,749 epoch 9 - iter 1782/1984 - loss 0.01278500 - time (sec): 81.06 - samples/sec: 1821.39 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 00:47:37,683 epoch 9 - iter 1980/1984 - loss 0.01286673 - time (sec): 89.99 - samples/sec: 1817.30 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 00:47:37,879 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:47:37,879 EPOCH 9 done: loss 0.0128 - lr: 0.000003 |
|
2023-10-14 00:47:41,336 DEV : loss 0.2166709452867508 - f1-score (micro avg) 0.7711 |
|
2023-10-14 00:47:41,357 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:47:50,410 epoch 10 - iter 198/1984 - loss 0.00699997 - time (sec): 9.05 - samples/sec: 1941.31 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 00:47:59,477 epoch 10 - iter 396/1984 - loss 0.00743539 - time (sec): 18.12 - samples/sec: 1881.47 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 00:48:08,429 epoch 10 - iter 594/1984 - loss 0.00701043 - time (sec): 27.07 - samples/sec: 1816.63 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 00:48:17,379 epoch 10 - iter 792/1984 - loss 0.00691793 - time (sec): 36.02 - samples/sec: 1826.01 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 00:48:26,370 epoch 10 - iter 990/1984 - loss 0.00733336 - time (sec): 45.01 - samples/sec: 1829.49 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 00:48:35,376 epoch 10 - iter 1188/1984 - loss 0.00787838 - time (sec): 54.02 - samples/sec: 1821.42 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 00:48:44,332 epoch 10 - iter 1386/1984 - loss 0.00801026 - time (sec): 62.97 - samples/sec: 1820.26 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 00:48:53,392 epoch 10 - iter 1584/1984 - loss 0.00795308 - time (sec): 72.03 - samples/sec: 1822.69 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 00:49:02,320 epoch 10 - iter 1782/1984 - loss 0.00774112 - time (sec): 80.96 - samples/sec: 1826.42 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-14 00:49:11,494 epoch 10 - iter 1980/1984 - loss 0.00777068 - time (sec): 90.14 - samples/sec: 1816.02 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-14 00:49:11,673 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:49:11,673 EPOCH 10 done: loss 0.0078 - lr: 0.000000 |
|
2023-10-14 00:49:15,510 DEV : loss 0.22600804269313812 - f1-score (micro avg) 0.7689 |
|
2023-10-14 00:49:15,955 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 00:49:15,956 Loading model from best epoch ... |
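The "saving best model" lines above can be reproduced from the DEV micro-average F1 scores alone. The sketch below assumes a checkpoint is written only when the dev score strictly exceeds the best score seen so far; that rule is an assumption inferred from this log, not a statement about Flair's internals. The scores are copied from the DEV lines of epochs 1 to 10.

```python
# Dev micro-average F1 per epoch, copied from the DEV lines of this log.
dev_f1 = [0.6206, 0.7239, 0.763, 0.7451, 0.7573,
          0.7573, 0.7615, 0.7714, 0.7711, 0.7689]

best_score = float("-inf")
saved_epochs = []  # epochs at which a new best checkpoint would be written
for epoch, score in enumerate(dev_f1, start=1):
    if score > best_score:  # assumed rule: strict improvement only
        best_score = score
        saved_epochs.append(epoch)

best_epoch = saved_epochs[-1]
print(saved_epochs, best_epoch, best_score)
```

With these scores the checkpoint is written after epochs 1, 2, 3 and 8, so the model loaded for the final evaluation is the one from epoch 8 (dev F1 0.7714).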
|
2023-10-14 00:49:17,330 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG |
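The 13 tags listed above follow a BIOES-style scheme: one O tag plus four position prefixes (S, B, E, I) for each of the three entity types, which is also why the final linear layer has out_features=13. A minimal sketch of how such a tag set can be enumerated; the helper name `bioes_tags` is hypothetical.

```python
def bioes_tags(entity_types):
    """Enumerate BIOES-style tags in the order shown in the log."""
    tags = ["O"]
    for entity_type in entity_types:
        # S = single-token entity, B = begin, E = end, I = inside
        tags.extend(f"{prefix}-{entity_type}" for prefix in ("S", "B", "E", "I"))
    return tags


tags = bioes_tags(["PER", "LOC", "ORG"])
print(len(tags), tags)
```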
|
2023-10-14 00:49:20,598 |
|
Results: |
|
- F-score (micro) 0.7769 |
|
- F-score (macro) 0.6893 |
|
- Accuracy 0.6567 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
         LOC     0.8162    0.8473    0.8315       655 |
         PER     0.7083    0.8386    0.7680       223 |
         ORG     0.5474    0.4094    0.4685       127 |
|
   micro avg     0.7642    0.7900    0.7769      1005 |
   macro avg     0.6906    0.6984    0.6893      1005 |
weighted avg     0.7583    0.7900    0.7715      1005 |
|
|
|
2023-10-14 00:49:20,598 ---------------------------------------------------------------------------------------------------- |
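The three averaged F1 values in the report above can be cross-checked from the per-class rows. This is a sketch using the rounded figures from the log, so results agree only to about four decimals: macro F1 is the unweighted mean of the class F1 scores, weighted F1 weights each class by its support, and micro F1 is the harmonic mean of the micro-averaged precision and recall.

```python
# Per-class rows copied from the report: (precision, recall, f1, support)
report = {
    "LOC": (0.8162, 0.8473, 0.8315, 655),
    "PER": (0.7083, 0.8386, 0.7680, 223),
    "ORG": (0.5474, 0.4094, 0.4685, 127),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(row[3] for row in report.values())
macro_f1 = sum(row[2] for row in report.values()) / len(report)
weighted_f1 = sum(row[2] * row[3] for row in report.values()) / total_support
micro_f1 = f1(0.7642, 0.7900)  # micro-averaged precision and recall from the log

print(f"macro {macro_f1:.4f}  weighted {weighted_f1:.4f}  micro {micro_f1:.4f}")
```

The gap between micro (0.7769) and macro (0.6893) F1 comes from ORG, which has the lowest F1 and the smallest support, so it pulls the unweighted mean down more than the support-weighted figures.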
|
|