2023-10-13 23:55:15,562 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,563 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 23:55:15,563 ----------------------------------------------------------------------------------------------------
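The dump above is a plain Flair SequenceTagger: a 12-layer BERT encoder (hidden size 768, vocabulary 32001) wrapped as TransformerWordEmbeddings, locked dropout of 0.5, and a single linear layer projecting the 768-dimensional word embeddings onto the 13 output tags, trained with cross-entropy rather than a CRF. A minimal sketch of the embedding setup, assuming the backbone and settings encoded in the model training base path further down (dbmdz/bert-base-historic-multilingual-cased, last layer only, first-subtoken pooling); the original training script is not part of this log:

```python
from flair.embeddings import TransformerWordEmbeddings

# Settings read off "...-poolingfirst-layers-1-crfFalse-..." in the base path below (assumed).
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-cased",
    layers="-1",               # use only the last transformer layer
    subtoken_pooling="first",  # represent each word by its first sub-token
    fine_tune=True,            # back-propagate into the BERT weights
)
```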
2023-10-13 23:55:15,564 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
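The corpus is the French split of the ICDAR-Europeana NER dataset that ships with Flair. A sketch of loading it and building the label dictionary used by the tagger; the dataset class name is taken from the log, while the language keyword and label-type argument are assumptions:

```python
from flair.datasets import NER_ICDAR_EUROPEANA

# Downloads to ~/.flair/datasets/ner_icdar_europeana/fr on first use (see the path above).
corpus = NER_ICDAR_EUROPEANA(language="fr")
print(corpus)  # expected: 7936 train + 992 dev + 992 test sentences

# The 13-tag dictionary reported at the end of this log is derived from the corpus annotations.
tag_dictionary = corpus.make_label_dictionary(label_type="ner")
```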
2023-10-13 23:55:15,564 Train: 7936 sentences
2023-10-13 23:55:15,564 (train_with_dev=False, train_with_test=False)
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Training Params:
2023-10-13 23:55:15,564 - learning_rate: "5e-05"
2023-10-13 23:55:15,564 - mini_batch_size: "4"
2023-10-13 23:55:15,564 - max_epochs: "10"
2023-10-13 23:55:15,564 - shuffle: "True"
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Plugins:
2023-10-13 23:55:15,564 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 23:55:15,564 - metric: "('micro avg', 'f1-score')"
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Computation:
2023-10-13 23:55:15,564 - compute on device: cuda:0
2023-10-13 23:55:15,564 - embedding storage: none
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
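The parameters above map onto Flair's fine-tuning API. A sketch of the tagger construction and training call, continuing from the embeddings and corpus sketches above; use_crf/use_rnn are read from the base path and the warmup default is inferred from the LinearScheduler plugin listed above, so treat the exact call as an approximation rather than the original script:

```python
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# `embeddings`, `corpus` and `tag_dictionary` as defined in the sketches above.
tagger = SequenceTagger(
    hidden_size=256,    # unused here: no RNN is stacked between embeddings and the linear head
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type="ner",
    use_crf=False,      # "crfFalse" in the base path: plain linear + cross-entropy head
    use_rnn=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
    # fine_tune() attaches a linear warmup/decay scheduler (warmup_fraction 0.1 by default),
    # which is the LinearScheduler plugin reported above.
)
```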
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:24,306 epoch 1 - iter 198/1984 - loss 1.50772186 - time (sec): 8.74 - samples/sec: 1863.43 - lr: 0.000005 - momentum: 0.000000
2023-10-13 23:55:32,982 epoch 1 - iter 396/1984 - loss 0.90184360 - time (sec): 17.42 - samples/sec: 1859.29 - lr: 0.000010 - momentum: 0.000000
2023-10-13 23:55:41,617 epoch 1 - iter 594/1984 - loss 0.67583760 - time (sec): 26.05 - samples/sec: 1849.23 - lr: 0.000015 - momentum: 0.000000
2023-10-13 23:55:50,732 epoch 1 - iter 792/1984 - loss 0.55353196 - time (sec): 35.17 - samples/sec: 1835.49 - lr: 0.000020 - momentum: 0.000000
2023-10-13 23:55:59,824 epoch 1 - iter 990/1984 - loss 0.47277969 - time (sec): 44.26 - samples/sec: 1838.05 - lr: 0.000025 - momentum: 0.000000
2023-10-13 23:56:08,934 epoch 1 - iter 1188/1984 - loss 0.41000215 - time (sec): 53.37 - samples/sec: 1862.91 - lr: 0.000030 - momentum: 0.000000
2023-10-13 23:56:17,850 epoch 1 - iter 1386/1984 - loss 0.37433987 - time (sec): 62.28 - samples/sec: 1853.85 - lr: 0.000035 - momentum: 0.000000
2023-10-13 23:56:27,010 epoch 1 - iter 1584/1984 - loss 0.34567962 - time (sec): 71.45 - samples/sec: 1845.25 - lr: 0.000040 - momentum: 0.000000
2023-10-13 23:56:36,022 epoch 1 - iter 1782/1984 - loss 0.32457592 - time (sec): 80.46 - samples/sec: 1832.92 - lr: 0.000045 - momentum: 0.000000
2023-10-13 23:56:44,992 epoch 1 - iter 1980/1984 - loss 0.30739042 - time (sec): 89.43 - samples/sec: 1829.76 - lr: 0.000050 - momentum: 0.000000
2023-10-13 23:56:45,174 ----------------------------------------------------------------------------------------------------
2023-10-13 23:56:45,174 EPOCH 1 done: loss 0.3074 - lr: 0.000050
2023-10-13 23:56:48,815 DEV : loss 0.1368168443441391 - f1-score (micro avg) 0.6857
2023-10-13 23:56:48,836 saving best model
2023-10-13 23:56:49,275 ----------------------------------------------------------------------------------------------------
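The lr column follows the linear warmup/decay schedule announced in the Plugins section: with 1984 steps per epoch over 10 epochs (19,840 steps total) and warmup_fraction 0.1, the learning rate climbs to its 5e-05 peak across exactly the first epoch and then decays linearly to zero. A small check of that closed form against the logged values (step counts are read from the log; the formula is the standard linear schedule, not copied from Flair source):

```python
peak_lr = 5e-05
steps_per_epoch = 1984
total_steps = 10 * steps_per_epoch     # 19,840 optimizer steps in total
warmup_steps = int(0.1 * total_steps)  # 1,984 steps, i.e. the whole first epoch

def lr_at(step: int) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(lr_at(198))                        # ~5.0e-06 -> logged as lr: 0.000005 (epoch 1, iter 198)
print(lr_at(1980))                       # ~5.0e-05 -> logged as lr: 0.000050 (epoch 1, iter 1980)
print(lr_at(steps_per_epoch + 990))      # ~4.7e-05 -> logged as lr: 0.000047 (epoch 2, iter 990)
print(lr_at(9 * steps_per_epoch + 1980)) # ~0       -> logged as lr: 0.000000 (epoch 10, iter 1980)
```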
2023-10-13 23:56:58,249 epoch 2 - iter 198/1984 - loss 0.12279295 - time (sec): 8.97 - samples/sec: 1784.86 - lr: 0.000049 - momentum: 0.000000
2023-10-13 23:57:07,264 epoch 2 - iter 396/1984 - loss 0.11883519 - time (sec): 17.99 - samples/sec: 1808.64 - lr: 0.000049 - momentum: 0.000000
2023-10-13 23:57:16,299 epoch 2 - iter 594/1984 - loss 0.12767893 - time (sec): 27.02 - samples/sec: 1811.49 - lr: 0.000048 - momentum: 0.000000
2023-10-13 23:57:25,354 epoch 2 - iter 792/1984 - loss 0.12653949 - time (sec): 36.08 - samples/sec: 1815.08 - lr: 0.000048 - momentum: 0.000000
2023-10-13 23:57:34,318 epoch 2 - iter 990/1984 - loss 0.12528597 - time (sec): 45.04 - samples/sec: 1818.90 - lr: 0.000047 - momentum: 0.000000
2023-10-13 23:57:43,288 epoch 2 - iter 1188/1984 - loss 0.12318635 - time (sec): 54.01 - samples/sec: 1823.91 - lr: 0.000047 - momentum: 0.000000
2023-10-13 23:57:52,593 epoch 2 - iter 1386/1984 - loss 0.12351316 - time (sec): 63.32 - samples/sec: 1813.04 - lr: 0.000046 - momentum: 0.000000
2023-10-13 23:58:01,570 epoch 2 - iter 1584/1984 - loss 0.12274190 - time (sec): 72.29 - samples/sec: 1811.65 - lr: 0.000046 - momentum: 0.000000
2023-10-13 23:58:10,839 epoch 2 - iter 1782/1984 - loss 0.11980283 - time (sec): 81.56 - samples/sec: 1809.58 - lr: 0.000045 - momentum: 0.000000
2023-10-13 23:58:19,789 epoch 2 - iter 1980/1984 - loss 0.11769240 - time (sec): 90.51 - samples/sec: 1808.94 - lr: 0.000044 - momentum: 0.000000
2023-10-13 23:58:19,967 ----------------------------------------------------------------------------------------------------
2023-10-13 23:58:19,967 EPOCH 2 done: loss 0.1178 - lr: 0.000044
2023-10-13 23:58:23,421 DEV : loss 0.1154676303267479 - f1-score (micro avg) 0.7134
2023-10-13 23:58:23,442 saving best model
2023-10-13 23:58:23,980 ----------------------------------------------------------------------------------------------------
2023-10-13 23:58:33,121 epoch 3 - iter 198/1984 - loss 0.08197092 - time (sec): 9.14 - samples/sec: 1751.59 - lr: 0.000044 - momentum: 0.000000
2023-10-13 23:58:41,994 epoch 3 - iter 396/1984 - loss 0.08662040 - time (sec): 18.01 - samples/sec: 1770.73 - lr: 0.000043 - momentum: 0.000000
2023-10-13 23:58:51,072 epoch 3 - iter 594/1984 - loss 0.08983954 - time (sec): 27.09 - samples/sec: 1801.95 - lr: 0.000043 - momentum: 0.000000
2023-10-13 23:59:00,163 epoch 3 - iter 792/1984 - loss 0.08922528 - time (sec): 36.18 - samples/sec: 1823.58 - lr: 0.000042 - momentum: 0.000000
2023-10-13 23:59:09,201 epoch 3 - iter 990/1984 - loss 0.09021301 - time (sec): 45.22 - samples/sec: 1817.72 - lr: 0.000042 - momentum: 0.000000
2023-10-13 23:59:18,138 epoch 3 - iter 1188/1984 - loss 0.09283341 - time (sec): 54.15 - samples/sec: 1810.17 - lr: 0.000041 - momentum: 0.000000
2023-10-13 23:59:27,086 epoch 3 - iter 1386/1984 - loss 0.09059581 - time (sec): 63.10 - samples/sec: 1810.95 - lr: 0.000041 - momentum: 0.000000
2023-10-13 23:59:36,138 epoch 3 - iter 1584/1984 - loss 0.08879431 - time (sec): 72.15 - samples/sec: 1817.41 - lr: 0.000040 - momentum: 0.000000
2023-10-13 23:59:44,813 epoch 3 - iter 1782/1984 - loss 0.08740617 - time (sec): 80.83 - samples/sec: 1824.42 - lr: 0.000039 - momentum: 0.000000
2023-10-13 23:59:53,461 epoch 3 - iter 1980/1984 - loss 0.08760937 - time (sec): 89.48 - samples/sec: 1830.63 - lr: 0.000039 - momentum: 0.000000
2023-10-13 23:59:53,633 ----------------------------------------------------------------------------------------------------
2023-10-13 23:59:53,633 EPOCH 3 done: loss 0.0877 - lr: 0.000039
2023-10-13 23:59:57,488 DEV : loss 0.11892345547676086 - f1-score (micro avg) 0.7438
2023-10-13 23:59:57,509 saving best model
2023-10-13 23:59:58,051 ----------------------------------------------------------------------------------------------------
2023-10-14 00:00:07,100 epoch 4 - iter 198/1984 - loss 0.06258502 - time (sec): 9.05 - samples/sec: 1759.68 - lr: 0.000038 - momentum: 0.000000
2023-10-14 00:00:16,056 epoch 4 - iter 396/1984 - loss 0.06650697 - time (sec): 18.00 - samples/sec: 1818.05 - lr: 0.000038 - momentum: 0.000000
2023-10-14 00:00:24,941 epoch 4 - iter 594/1984 - loss 0.06754876 - time (sec): 26.89 - samples/sec: 1775.71 - lr: 0.000037 - momentum: 0.000000
2023-10-14 00:00:33,992 epoch 4 - iter 792/1984 - loss 0.06932127 - time (sec): 35.94 - samples/sec: 1791.07 - lr: 0.000037 - momentum: 0.000000
2023-10-14 00:00:42,974 epoch 4 - iter 990/1984 - loss 0.06932810 - time (sec): 44.92 - samples/sec: 1802.23 - lr: 0.000036 - momentum: 0.000000
2023-10-14 00:00:52,014 epoch 4 - iter 1188/1984 - loss 0.07083132 - time (sec): 53.96 - samples/sec: 1813.38 - lr: 0.000036 - momentum: 0.000000
2023-10-14 00:01:01,168 epoch 4 - iter 1386/1984 - loss 0.07101762 - time (sec): 63.11 - samples/sec: 1805.21 - lr: 0.000035 - momentum: 0.000000
2023-10-14 00:01:10,193 epoch 4 - iter 1584/1984 - loss 0.07066029 - time (sec): 72.14 - samples/sec: 1799.01 - lr: 0.000034 - momentum: 0.000000
2023-10-14 00:01:19,160 epoch 4 - iter 1782/1984 - loss 0.07056362 - time (sec): 81.11 - samples/sec: 1804.70 - lr: 0.000034 - momentum: 0.000000
2023-10-14 00:01:28,184 epoch 4 - iter 1980/1984 - loss 0.06978300 - time (sec): 90.13 - samples/sec: 1816.23 - lr: 0.000033 - momentum: 0.000000
2023-10-14 00:01:28,365 ----------------------------------------------------------------------------------------------------
2023-10-14 00:01:28,365 EPOCH 4 done: loss 0.0697 - lr: 0.000033
2023-10-14 00:01:31,842 DEV : loss 0.15688975155353546 - f1-score (micro avg) 0.7467
2023-10-14 00:01:31,863 saving best model
2023-10-14 00:01:32,417 ----------------------------------------------------------------------------------------------------
2023-10-14 00:01:41,585 epoch 5 - iter 198/1984 - loss 0.05068501 - time (sec): 9.16 - samples/sec: 1761.62 - lr: 0.000033 - momentum: 0.000000
2023-10-14 00:01:50,548 epoch 5 - iter 396/1984 - loss 0.05451821 - time (sec): 18.13 - samples/sec: 1811.75 - lr: 0.000032 - momentum: 0.000000
2023-10-14 00:01:59,521 epoch 5 - iter 594/1984 - loss 0.05001998 - time (sec): 27.10 - samples/sec: 1843.55 - lr: 0.000032 - momentum: 0.000000
2023-10-14 00:02:08,376 epoch 5 - iter 792/1984 - loss 0.04929968 - time (sec): 35.96 - samples/sec: 1826.24 - lr: 0.000031 - momentum: 0.000000
2023-10-14 00:02:17,340 epoch 5 - iter 990/1984 - loss 0.04965478 - time (sec): 44.92 - samples/sec: 1813.73 - lr: 0.000031 - momentum: 0.000000
2023-10-14 00:02:26,433 epoch 5 - iter 1188/1984 - loss 0.04955068 - time (sec): 54.01 - samples/sec: 1816.53 - lr: 0.000030 - momentum: 0.000000
2023-10-14 00:02:35,499 epoch 5 - iter 1386/1984 - loss 0.04994146 - time (sec): 63.08 - samples/sec: 1821.68 - lr: 0.000029 - momentum: 0.000000
2023-10-14 00:02:44,636 epoch 5 - iter 1584/1984 - loss 0.05157217 - time (sec): 72.21 - samples/sec: 1829.33 - lr: 0.000029 - momentum: 0.000000
2023-10-14 00:02:53,623 epoch 5 - iter 1782/1984 - loss 0.04967315 - time (sec): 81.20 - samples/sec: 1822.14 - lr: 0.000028 - momentum: 0.000000
2023-10-14 00:03:02,598 epoch 5 - iter 1980/1984 - loss 0.04984569 - time (sec): 90.18 - samples/sec: 1813.98 - lr: 0.000028 - momentum: 0.000000
2023-10-14 00:03:02,783 ----------------------------------------------------------------------------------------------------
2023-10-14 00:03:02,783 EPOCH 5 done: loss 0.0501 - lr: 0.000028
2023-10-14 00:03:06,243 DEV : loss 0.18403209745883942 - f1-score (micro avg) 0.7202
2023-10-14 00:03:06,264 ----------------------------------------------------------------------------------------------------
2023-10-14 00:03:15,457 epoch 6 - iter 198/1984 - loss 0.04584627 - time (sec): 9.19 - samples/sec: 1898.94 - lr: 0.000027 - momentum: 0.000000
2023-10-14 00:03:24,449 epoch 6 - iter 396/1984 - loss 0.04186945 - time (sec): 18.18 - samples/sec: 1829.36 - lr: 0.000027 - momentum: 0.000000
2023-10-14 00:03:33,393 epoch 6 - iter 594/1984 - loss 0.03941596 - time (sec): 27.13 - samples/sec: 1800.46 - lr: 0.000026 - momentum: 0.000000
2023-10-14 00:03:42,411 epoch 6 - iter 792/1984 - loss 0.03879688 - time (sec): 36.15 - samples/sec: 1808.18 - lr: 0.000026 - momentum: 0.000000
2023-10-14 00:03:51,359 epoch 6 - iter 990/1984 - loss 0.03828610 - time (sec): 45.09 - samples/sec: 1799.34 - lr: 0.000025 - momentum: 0.000000
2023-10-14 00:04:00,278 epoch 6 - iter 1188/1984 - loss 0.03781164 - time (sec): 54.01 - samples/sec: 1796.84 - lr: 0.000024 - momentum: 0.000000
2023-10-14 00:04:09,404 epoch 6 - iter 1386/1984 - loss 0.03790898 - time (sec): 63.14 - samples/sec: 1807.80 - lr: 0.000024 - momentum: 0.000000
2023-10-14 00:04:18,592 epoch 6 - iter 1584/1984 - loss 0.03778105 - time (sec): 72.33 - samples/sec: 1803.97 - lr: 0.000023 - momentum: 0.000000
2023-10-14 00:04:27,635 epoch 6 - iter 1782/1984 - loss 0.03857277 - time (sec): 81.37 - samples/sec: 1807.22 - lr: 0.000023 - momentum: 0.000000
2023-10-14 00:04:37,083 epoch 6 - iter 1980/1984 - loss 0.03900797 - time (sec): 90.82 - samples/sec: 1802.90 - lr: 0.000022 - momentum: 0.000000
2023-10-14 00:04:37,255 ----------------------------------------------------------------------------------------------------
2023-10-14 00:04:37,255 EPOCH 6 done: loss 0.0390 - lr: 0.000022
2023-10-14 00:04:40,657 DEV : loss 0.18327702581882477 - f1-score (micro avg) 0.7536
2023-10-14 00:04:40,678 saving best model
2023-10-14 00:04:41,211 ----------------------------------------------------------------------------------------------------
2023-10-14 00:04:50,351 epoch 7 - iter 198/1984 - loss 0.02501832 - time (sec): 9.14 - samples/sec: 1831.27 - lr: 0.000022 - momentum: 0.000000
2023-10-14 00:04:59,277 epoch 7 - iter 396/1984 - loss 0.02322471 - time (sec): 18.06 - samples/sec: 1834.84 - lr: 0.000021 - momentum: 0.000000
2023-10-14 00:05:08,302 epoch 7 - iter 594/1984 - loss 0.02153630 - time (sec): 27.09 - samples/sec: 1838.56 - lr: 0.000021 - momentum: 0.000000
2023-10-14 00:05:17,224 epoch 7 - iter 792/1984 - loss 0.02269734 - time (sec): 36.01 - samples/sec: 1807.75 - lr: 0.000020 - momentum: 0.000000
2023-10-14 00:05:26,210 epoch 7 - iter 990/1984 - loss 0.02477186 - time (sec): 44.99 - samples/sec: 1823.24 - lr: 0.000019 - momentum: 0.000000
2023-10-14 00:05:35,217 epoch 7 - iter 1188/1984 - loss 0.02497735 - time (sec): 54.00 - samples/sec: 1816.37 - lr: 0.000019 - momentum: 0.000000
2023-10-14 00:05:44,152 epoch 7 - iter 1386/1984 - loss 0.02400522 - time (sec): 62.94 - samples/sec: 1812.54 - lr: 0.000018 - momentum: 0.000000
2023-10-14 00:05:53,131 epoch 7 - iter 1584/1984 - loss 0.02529192 - time (sec): 71.91 - samples/sec: 1811.38 - lr: 0.000018 - momentum: 0.000000
2023-10-14 00:06:02,211 epoch 7 - iter 1782/1984 - loss 0.02578243 - time (sec): 80.99 - samples/sec: 1813.40 - lr: 0.000017 - momentum: 0.000000
2023-10-14 00:06:11,197 epoch 7 - iter 1980/1984 - loss 0.02724683 - time (sec): 89.98 - samples/sec: 1819.80 - lr: 0.000017 - momentum: 0.000000
2023-10-14 00:06:11,373 ----------------------------------------------------------------------------------------------------
2023-10-14 00:06:11,373 EPOCH 7 done: loss 0.0272 - lr: 0.000017
2023-10-14 00:06:14,886 DEV : loss 0.19809222221374512 - f1-score (micro avg) 0.7694
2023-10-14 00:06:14,908 saving best model
2023-10-14 00:06:15,452 ----------------------------------------------------------------------------------------------------
2023-10-14 00:06:24,684 epoch 8 - iter 198/1984 - loss 0.02503918 - time (sec): 9.23 - samples/sec: 1846.98 - lr: 0.000016 - momentum: 0.000000
2023-10-14 00:06:33,673 epoch 8 - iter 396/1984 - loss 0.02450807 - time (sec): 18.22 - samples/sec: 1830.88 - lr: 0.000016 - momentum: 0.000000
2023-10-14 00:06:42,985 epoch 8 - iter 594/1984 - loss 0.02228166 - time (sec): 27.53 - samples/sec: 1833.48 - lr: 0.000015 - momentum: 0.000000
2023-10-14 00:06:51,897 epoch 8 - iter 792/1984 - loss 0.02090114 - time (sec): 36.44 - samples/sec: 1831.96 - lr: 0.000014 - momentum: 0.000000
2023-10-14 00:07:00,918 epoch 8 - iter 990/1984 - loss 0.02167482 - time (sec): 45.46 - samples/sec: 1805.64 - lr: 0.000014 - momentum: 0.000000
2023-10-14 00:07:10,067 epoch 8 - iter 1188/1984 - loss 0.02111459 - time (sec): 54.61 - samples/sec: 1802.67 - lr: 0.000013 - momentum: 0.000000
2023-10-14 00:07:19,289 epoch 8 - iter 1386/1984 - loss 0.02001664 - time (sec): 63.83 - samples/sec: 1799.08 - lr: 0.000013 - momentum: 0.000000
2023-10-14 00:07:28,458 epoch 8 - iter 1584/1984 - loss 0.01937469 - time (sec): 73.00 - samples/sec: 1800.62 - lr: 0.000012 - momentum: 0.000000
2023-10-14 00:07:37,598 epoch 8 - iter 1782/1984 - loss 0.01890756 - time (sec): 82.14 - samples/sec: 1803.09 - lr: 0.000012 - momentum: 0.000000
2023-10-14 00:07:46,695 epoch 8 - iter 1980/1984 - loss 0.01916413 - time (sec): 91.24 - samples/sec: 1793.56 - lr: 0.000011 - momentum: 0.000000
2023-10-14 00:07:46,880 ----------------------------------------------------------------------------------------------------
2023-10-14 00:07:46,880 EPOCH 8 done: loss 0.0191 - lr: 0.000011
2023-10-14 00:07:50,764 DEV : loss 0.20701323449611664 - f1-score (micro avg) 0.7508
2023-10-14 00:07:50,785 ----------------------------------------------------------------------------------------------------
2023-10-14 00:07:59,690 epoch 9 - iter 198/1984 - loss 0.01683448 - time (sec): 8.90 - samples/sec: 1786.85 - lr: 0.000011 - momentum: 0.000000
2023-10-14 00:08:08,744 epoch 9 - iter 396/1984 - loss 0.01424361 - time (sec): 17.96 - samples/sec: 1797.04 - lr: 0.000010 - momentum: 0.000000
2023-10-14 00:08:17,764 epoch 9 - iter 594/1984 - loss 0.01329647 - time (sec): 26.98 - samples/sec: 1831.07 - lr: 0.000009 - momentum: 0.000000
2023-10-14 00:08:26,715 epoch 9 - iter 792/1984 - loss 0.01361289 - time (sec): 35.93 - samples/sec: 1835.51 - lr: 0.000009 - momentum: 0.000000
2023-10-14 00:08:35,693 epoch 9 - iter 990/1984 - loss 0.01248769 - time (sec): 44.91 - samples/sec: 1828.13 - lr: 0.000008 - momentum: 0.000000
2023-10-14 00:08:44,665 epoch 9 - iter 1188/1984 - loss 0.01303478 - time (sec): 53.88 - samples/sec: 1828.78 - lr: 0.000008 - momentum: 0.000000
2023-10-14 00:08:53,901 epoch 9 - iter 1386/1984 - loss 0.01248985 - time (sec): 63.12 - samples/sec: 1814.35 - lr: 0.000007 - momentum: 0.000000
2023-10-14 00:09:02,963 epoch 9 - iter 1584/1984 - loss 0.01249750 - time (sec): 72.18 - samples/sec: 1813.72 - lr: 0.000007 - momentum: 0.000000
2023-10-14 00:09:12,129 epoch 9 - iter 1782/1984 - loss 0.01229768 - time (sec): 81.34 - samples/sec: 1806.58 - lr: 0.000006 - momentum: 0.000000
2023-10-14 00:09:21,106 epoch 9 - iter 1980/1984 - loss 0.01212657 - time (sec): 90.32 - samples/sec: 1811.73 - lr: 0.000006 - momentum: 0.000000
2023-10-14 00:09:21,297 ----------------------------------------------------------------------------------------------------
2023-10-14 00:09:21,297 EPOCH 9 done: loss 0.0122 - lr: 0.000006
2023-10-14 00:09:24,760 DEV : loss 0.24803374707698822 - f1-score (micro avg) 0.7598
2023-10-14 00:09:24,781 ----------------------------------------------------------------------------------------------------
2023-10-14 00:09:34,131 epoch 10 - iter 198/1984 - loss 0.01047072 - time (sec): 9.35 - samples/sec: 1835.03 - lr: 0.000005 - momentum: 0.000000
2023-10-14 00:09:43,100 epoch 10 - iter 396/1984 - loss 0.00852613 - time (sec): 18.32 - samples/sec: 1812.58 - lr: 0.000004 - momentum: 0.000000
2023-10-14 00:09:52,160 epoch 10 - iter 594/1984 - loss 0.00813734 - time (sec): 27.38 - samples/sec: 1818.87 - lr: 0.000004 - momentum: 0.000000
2023-10-14 00:10:01,309 epoch 10 - iter 792/1984 - loss 0.00748039 - time (sec): 36.53 - samples/sec: 1827.01 - lr: 0.000003 - momentum: 0.000000
2023-10-14 00:10:10,508 epoch 10 - iter 990/1984 - loss 0.00764780 - time (sec): 45.73 - samples/sec: 1837.44 - lr: 0.000003 - momentum: 0.000000
2023-10-14 00:10:19,392 epoch 10 - iter 1188/1984 - loss 0.00762812 - time (sec): 54.61 - samples/sec: 1832.58 - lr: 0.000002 - momentum: 0.000000
2023-10-14 00:10:28,228 epoch 10 - iter 1386/1984 - loss 0.00802417 - time (sec): 63.44 - samples/sec: 1820.05 - lr: 0.000002 - momentum: 0.000000
2023-10-14 00:10:37,140 epoch 10 - iter 1584/1984 - loss 0.00806189 - time (sec): 72.36 - samples/sec: 1809.06 - lr: 0.000001 - momentum: 0.000000
2023-10-14 00:10:46,228 epoch 10 - iter 1782/1984 - loss 0.00772889 - time (sec): 81.44 - samples/sec: 1803.08 - lr: 0.000001 - momentum: 0.000000
2023-10-14 00:10:55,215 epoch 10 - iter 1980/1984 - loss 0.00739086 - time (sec): 90.43 - samples/sec: 1810.01 - lr: 0.000000 - momentum: 0.000000
2023-10-14 00:10:55,397 ----------------------------------------------------------------------------------------------------
2023-10-14 00:10:55,397 EPOCH 10 done: loss 0.0074 - lr: 0.000000
2023-10-14 00:10:59,305 DEV : loss 0.24994361400604248 - f1-score (micro avg) 0.7672
2023-10-14 00:10:59,753 ----------------------------------------------------------------------------------------------------
2023-10-14 00:10:59,754 Loading model from best epoch ...
2023-10-14 00:11:01,093 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 00:11:04,428
Results:
- F-score (micro) 0.7713
- F-score (macro) 0.6774
- Accuracy 0.6506

By class:
              precision    recall  f1-score   support

         LOC     0.8082    0.8427    0.8251       655
         PER     0.6825    0.8386    0.7525       223
         ORG     0.6338    0.3543    0.4545       127

   micro avg     0.7626    0.7801    0.7713      1005
   macro avg     0.7082    0.6785    0.6774      1005
weighted avg     0.7583    0.7801    0.7622      1005

2023-10-14 00:11:04,428 ----------------------------------------------------------------------------------------------------
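As a sanity check on the final report, the micro-averaged F1 of 0.7713 is the harmonic mean of the micro precision and recall above: 2 · 0.7626 · 0.7801 / (0.7626 + 0.7801) ≈ 0.7713. To reuse the trained tagger, the best-model.pt written under the training base path can be loaded with Flair's standard prediction API; a minimal sketch (the example sentence is invented, and the relative path simply repeats the base path logged above):

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# best-model.pt is saved whenever the dev micro-F1 improves (see "saving best model" above).
tagger = SequenceTagger.load(
    "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("Victor Hugo est né à Besançon .")  # invented example sentence
tagger.predict(sentence)

# Entity spans are decoded from the 13 BIOES tags (O plus S-/B-/E-/I- for PER, LOC, ORG).
for span in sentence.get_spans("ner"):
    label = span.get_label("ner")
    print(span.text, label.value, round(label.score, 4))
```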