2023-10-13 22:21:32,062 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,063 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 22:21:32,063 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,063 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 22:21:32,063 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,063 Train: 7936 sentences
2023-10-13 22:21:32,063 (train_with_dev=False, train_with_test=False)
2023-10-13 22:21:32,063 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,064 Training Params:
2023-10-13 22:21:32,064 - learning_rate: "3e-05"
2023-10-13 22:21:32,064 - mini_batch_size: "8"
2023-10-13 22:21:32,064 - max_epochs: "10"
2023-10-13 22:21:32,064 - shuffle: "True"
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,064 Plugins:
2023-10-13 22:21:32,064 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
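The LinearScheduler plugin with warmup_fraction 0.1 explains the lr column in the iteration lines of this log: with 992 iters per epoch over 10 epochs (9,920 steps total), the learning rate ramps linearly from 0 to the 3e-05 peak during the first 992 steps (all of epoch 1), then decays linearly to 0 by the final step. A minimal pure-Python sketch of that schedule (the function name is illustrative, not Flair's API):

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0, as in the log above."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: 0 -> peak_lr over the first warmup_fraction of training
        return peak_lr * step / warmup_steps
    # decay phase: peak_lr -> 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

For example, step 99 of 9,920 gives roughly 3e-06 and step 992 gives exactly the 3e-05 peak, matching the first epoch's logged lr values.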
2023-10-13 22:21:32,064 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 22:21:32,064 - metric: "('micro avg', 'f1-score')"
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,064 Computation:
2023-10-13 22:21:32,064 - compute on device: cuda:0
2023-10-13 22:21:32,064 - embedding storage: none
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,064 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:38,226 epoch 1 - iter 99/992 - loss 2.33299355 - time (sec): 6.16 - samples/sec: 2821.27 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:21:43,899 epoch 1 - iter 198/992 - loss 1.45654399 - time (sec): 11.83 - samples/sec: 2803.24 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:21:49,704 epoch 1 - iter 297/992 - loss 1.08734806 - time (sec): 17.64 - samples/sec: 2777.39 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:21:55,475 epoch 1 - iter 396/992 - loss 0.87857029 - time (sec): 23.41 - samples/sec: 2773.46 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:22:01,199 epoch 1 - iter 495/992 - loss 0.74066449 - time (sec): 29.13 - samples/sec: 2783.19 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:22:07,235 epoch 1 - iter 594/992 - loss 0.64158469 - time (sec): 35.17 - samples/sec: 2778.31 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:22:12,920 epoch 1 - iter 693/992 - loss 0.57302135 - time (sec): 40.86 - samples/sec: 2782.79 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:22:18,655 epoch 1 - iter 792/992 - loss 0.51792287 - time (sec): 46.59 - samples/sec: 2794.99 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:22:24,750 epoch 1 - iter 891/992 - loss 0.47364366 - time (sec): 52.68 - samples/sec: 2790.38 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:22:30,602 epoch 1 - iter 990/992 - loss 0.43807375 - time (sec): 58.54 - samples/sec: 2793.01 - lr: 0.000030 - momentum: 0.000000
2023-10-13 22:22:30,750 ----------------------------------------------------------------------------------------------------
2023-10-13 22:22:30,750 EPOCH 1 done: loss 0.4368 - lr: 0.000030
2023-10-13 22:22:33,919 DEV : loss 0.09895769506692886 - f1-score (micro avg) 0.6992
2023-10-13 22:22:33,940 saving best model
2023-10-13 22:22:34,356 ----------------------------------------------------------------------------------------------------
2023-10-13 22:22:40,079 epoch 2 - iter 99/992 - loss 0.10440393 - time (sec): 5.72 - samples/sec: 2716.33 - lr: 0.000030 - momentum: 0.000000
2023-10-13 22:22:45,944 epoch 2 - iter 198/992 - loss 0.10904220 - time (sec): 11.59 - samples/sec: 2709.89 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:22:52,058 epoch 2 - iter 297/992 - loss 0.10445266 - time (sec): 17.70 - samples/sec: 2738.80 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:22:57,812 epoch 2 - iter 396/992 - loss 0.10543564 - time (sec): 23.45 - samples/sec: 2776.19 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:23:03,822 epoch 2 - iter 495/992 - loss 0.10401639 - time (sec): 29.46 - samples/sec: 2761.40 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:23:09,512 epoch 2 - iter 594/992 - loss 0.10324383 - time (sec): 35.16 - samples/sec: 2769.06 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:23:15,203 epoch 2 - iter 693/992 - loss 0.10117912 - time (sec): 40.85 - samples/sec: 2773.98 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:23:21,294 epoch 2 - iter 792/992 - loss 0.10187508 - time (sec): 46.94 - samples/sec: 2770.45 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:23:27,242 epoch 2 - iter 891/992 - loss 0.10186016 - time (sec): 52.89 - samples/sec: 2763.64 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:23:33,244 epoch 2 - iter 990/992 - loss 0.10129833 - time (sec): 58.89 - samples/sec: 2778.25 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:23:33,369 ----------------------------------------------------------------------------------------------------
2023-10-13 22:23:33,370 EPOCH 2 done: loss 0.1012 - lr: 0.000027
2023-10-13 22:23:36,856 DEV : loss 0.09443922340869904 - f1-score (micro avg) 0.7469
2023-10-13 22:23:36,889 saving best model
2023-10-13 22:23:37,421 ----------------------------------------------------------------------------------------------------
2023-10-13 22:23:44,041 epoch 3 - iter 99/992 - loss 0.07835091 - time (sec): 6.62 - samples/sec: 2418.55 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:23:49,988 epoch 3 - iter 198/992 - loss 0.07452825 - time (sec): 12.57 - samples/sec: 2565.06 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:23:55,653 epoch 3 - iter 297/992 - loss 0.06912775 - time (sec): 18.23 - samples/sec: 2646.29 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:24:01,450 epoch 3 - iter 396/992 - loss 0.07019004 - time (sec): 24.03 - samples/sec: 2688.30 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:24:07,403 epoch 3 - iter 495/992 - loss 0.06908856 - time (sec): 29.98 - samples/sec: 2698.01 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:24:13,363 epoch 3 - iter 594/992 - loss 0.06956091 - time (sec): 35.94 - samples/sec: 2710.32 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:24:19,337 epoch 3 - iter 693/992 - loss 0.07164602 - time (sec): 41.91 - samples/sec: 2708.74 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:24:25,264 epoch 3 - iter 792/992 - loss 0.07204638 - time (sec): 47.84 - samples/sec: 2726.73 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:24:31,199 epoch 3 - iter 891/992 - loss 0.07025467 - time (sec): 53.78 - samples/sec: 2740.60 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:24:37,045 epoch 3 - iter 990/992 - loss 0.06972650 - time (sec): 59.62 - samples/sec: 2746.36 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:24:37,147 ----------------------------------------------------------------------------------------------------
2023-10-13 22:24:37,147 EPOCH 3 done: loss 0.0697 - lr: 0.000023
2023-10-13 22:24:40,552 DEV : loss 0.10710007697343826 - f1-score (micro avg) 0.7477
2023-10-13 22:24:40,575 saving best model
2023-10-13 22:24:41,077 ----------------------------------------------------------------------------------------------------
2023-10-13 22:24:46,925 epoch 4 - iter 99/992 - loss 0.04664656 - time (sec): 5.84 - samples/sec: 2741.14 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:24:53,063 epoch 4 - iter 198/992 - loss 0.04715979 - time (sec): 11.98 - samples/sec: 2768.98 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:24:59,015 epoch 4 - iter 297/992 - loss 0.04504200 - time (sec): 17.93 - samples/sec: 2753.00 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:25:04,781 epoch 4 - iter 396/992 - loss 0.04552468 - time (sec): 23.70 - samples/sec: 2777.07 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:25:10,615 epoch 4 - iter 495/992 - loss 0.04550867 - time (sec): 29.53 - samples/sec: 2791.41 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:25:16,451 epoch 4 - iter 594/992 - loss 0.04745191 - time (sec): 35.37 - samples/sec: 2789.76 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:25:22,220 epoch 4 - iter 693/992 - loss 0.04806225 - time (sec): 41.14 - samples/sec: 2786.19 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:25:28,218 epoch 4 - iter 792/992 - loss 0.04807101 - time (sec): 47.14 - samples/sec: 2781.22 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:25:33,783 epoch 4 - iter 891/992 - loss 0.04884662 - time (sec): 52.70 - samples/sec: 2795.26 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:25:39,847 epoch 4 - iter 990/992 - loss 0.04904023 - time (sec): 58.77 - samples/sec: 2783.19 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:25:39,971 ----------------------------------------------------------------------------------------------------
2023-10-13 22:25:39,972 EPOCH 4 done: loss 0.0491 - lr: 0.000020
2023-10-13 22:25:43,378 DEV : loss 0.1333416849374771 - f1-score (micro avg) 0.7489
2023-10-13 22:25:43,398 saving best model
2023-10-13 22:25:44,266 ----------------------------------------------------------------------------------------------------
2023-10-13 22:25:50,041 epoch 5 - iter 99/992 - loss 0.03508164 - time (sec): 5.77 - samples/sec: 2761.94 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:25:55,794 epoch 5 - iter 198/992 - loss 0.03490154 - time (sec): 11.52 - samples/sec: 2811.59 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:26:01,520 epoch 5 - iter 297/992 - loss 0.03554118 - time (sec): 17.25 - samples/sec: 2835.25 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:26:07,492 epoch 5 - iter 396/992 - loss 0.03723271 - time (sec): 23.22 - samples/sec: 2830.39 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:26:13,620 epoch 5 - iter 495/992 - loss 0.03770116 - time (sec): 29.35 - samples/sec: 2814.86 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:26:19,504 epoch 5 - iter 594/992 - loss 0.03766517 - time (sec): 35.23 - samples/sec: 2797.30 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:26:25,285 epoch 5 - iter 693/992 - loss 0.03805744 - time (sec): 41.01 - samples/sec: 2792.03 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:26:31,050 epoch 5 - iter 792/992 - loss 0.03684000 - time (sec): 46.78 - samples/sec: 2788.99 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:26:37,088 epoch 5 - iter 891/992 - loss 0.03834438 - time (sec): 52.82 - samples/sec: 2790.28 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:26:42,739 epoch 5 - iter 990/992 - loss 0.03863747 - time (sec): 58.47 - samples/sec: 2796.55 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:26:42,895 ----------------------------------------------------------------------------------------------------
2023-10-13 22:26:42,895 EPOCH 5 done: loss 0.0386 - lr: 0.000017
2023-10-13 22:26:46,395 DEV : loss 0.1474953144788742 - f1-score (micro avg) 0.7697
2023-10-13 22:26:46,416 saving best model
2023-10-13 22:26:46,900 ----------------------------------------------------------------------------------------------------
2023-10-13 22:26:52,655 epoch 6 - iter 99/992 - loss 0.03066144 - time (sec): 5.75 - samples/sec: 2870.08 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:26:58,650 epoch 6 - iter 198/992 - loss 0.02576809 - time (sec): 11.75 - samples/sec: 2842.92 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:27:04,479 epoch 6 - iter 297/992 - loss 0.02640387 - time (sec): 17.58 - samples/sec: 2865.75 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:27:10,420 epoch 6 - iter 396/992 - loss 0.02662851 - time (sec): 23.52 - samples/sec: 2839.78 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:27:16,188 epoch 6 - iter 495/992 - loss 0.02679061 - time (sec): 29.28 - samples/sec: 2812.92 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:27:22,018 epoch 6 - iter 594/992 - loss 0.02779911 - time (sec): 35.12 - samples/sec: 2804.97 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:27:27,615 epoch 6 - iter 693/992 - loss 0.02793110 - time (sec): 40.71 - samples/sec: 2789.84 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:27:33,725 epoch 6 - iter 792/992 - loss 0.02849861 - time (sec): 46.82 - samples/sec: 2800.86 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:27:39,500 epoch 6 - iter 891/992 - loss 0.03003150 - time (sec): 52.60 - samples/sec: 2796.36 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:27:45,315 epoch 6 - iter 990/992 - loss 0.02931730 - time (sec): 58.41 - samples/sec: 2799.63 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:27:45,439 ----------------------------------------------------------------------------------------------------
2023-10-13 22:27:45,439 EPOCH 6 done: loss 0.0294 - lr: 0.000013
2023-10-13 22:27:48,830 DEV : loss 0.18882010877132416 - f1-score (micro avg) 0.7443
2023-10-13 22:27:48,850 ----------------------------------------------------------------------------------------------------
2023-10-13 22:27:54,585 epoch 7 - iter 99/992 - loss 0.01843936 - time (sec): 5.73 - samples/sec: 2830.51 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:28:00,478 epoch 7 - iter 198/992 - loss 0.01884112 - time (sec): 11.63 - samples/sec: 2775.82 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:28:07,139 epoch 7 - iter 297/992 - loss 0.02033178 - time (sec): 18.29 - samples/sec: 2712.51 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:28:13,151 epoch 7 - iter 396/992 - loss 0.02087243 - time (sec): 24.30 - samples/sec: 2732.06 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:28:19,012 epoch 7 - iter 495/992 - loss 0.02087418 - time (sec): 30.16 - samples/sec: 2747.66 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:28:24,686 epoch 7 - iter 594/992 - loss 0.02171817 - time (sec): 35.83 - samples/sec: 2751.03 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:28:30,346 epoch 7 - iter 693/992 - loss 0.02240224 - time (sec): 41.49 - samples/sec: 2767.35 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:28:36,072 epoch 7 - iter 792/992 - loss 0.02183838 - time (sec): 47.22 - samples/sec: 2779.85 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:28:41,868 epoch 7 - iter 891/992 - loss 0.02200465 - time (sec): 53.02 - samples/sec: 2773.88 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:28:47,905 epoch 7 - iter 990/992 - loss 0.02214401 - time (sec): 59.05 - samples/sec: 2773.37 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:28:48,003 ----------------------------------------------------------------------------------------------------
2023-10-13 22:28:48,003 EPOCH 7 done: loss 0.0221 - lr: 0.000010
2023-10-13 22:28:51,462 DEV : loss 0.18745747208595276 - f1-score (micro avg) 0.7648
2023-10-13 22:28:51,483 ----------------------------------------------------------------------------------------------------
2023-10-13 22:28:57,625 epoch 8 - iter 99/992 - loss 0.01733595 - time (sec): 6.14 - samples/sec: 2687.32 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:29:03,732 epoch 8 - iter 198/992 - loss 0.01451052 - time (sec): 12.25 - samples/sec: 2732.96 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:29:09,698 epoch 8 - iter 297/992 - loss 0.01458980 - time (sec): 18.21 - samples/sec: 2747.88 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:29:15,627 epoch 8 - iter 396/992 - loss 0.01434329 - time (sec): 24.14 - samples/sec: 2786.27 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:29:21,451 epoch 8 - iter 495/992 - loss 0.01589815 - time (sec): 29.97 - samples/sec: 2787.43 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:29:27,300 epoch 8 - iter 594/992 - loss 0.01547773 - time (sec): 35.82 - samples/sec: 2772.84 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:29:33,240 epoch 8 - iter 693/992 - loss 0.01546711 - time (sec): 41.76 - samples/sec: 2752.58 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:29:39,317 epoch 8 - iter 792/992 - loss 0.01554967 - time (sec): 47.83 - samples/sec: 2748.23 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:29:45,154 epoch 8 - iter 891/992 - loss 0.01624295 - time (sec): 53.67 - samples/sec: 2751.29 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:29:51,001 epoch 8 - iter 990/992 - loss 0.01584456 - time (sec): 59.52 - samples/sec: 2752.74 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:29:51,101 ----------------------------------------------------------------------------------------------------
2023-10-13 22:29:51,101 EPOCH 8 done: loss 0.0160 - lr: 0.000007
2023-10-13 22:29:54,567 DEV : loss 0.19708073139190674 - f1-score (micro avg) 0.7606
2023-10-13 22:29:54,588 ----------------------------------------------------------------------------------------------------
2023-10-13 22:30:00,193 epoch 9 - iter 99/992 - loss 0.00851250 - time (sec): 5.60 - samples/sec: 2638.38 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:30:05,871 epoch 9 - iter 198/992 - loss 0.01387637 - time (sec): 11.28 - samples/sec: 2720.77 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:30:11,835 epoch 9 - iter 297/992 - loss 0.01335263 - time (sec): 17.25 - samples/sec: 2715.07 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:30:17,685 epoch 9 - iter 396/992 - loss 0.01153281 - time (sec): 23.10 - samples/sec: 2735.71 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:30:23,804 epoch 9 - iter 495/992 - loss 0.01105092 - time (sec): 29.21 - samples/sec: 2740.26 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:30:29,756 epoch 9 - iter 594/992 - loss 0.01139772 - time (sec): 35.17 - samples/sec: 2748.28 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:30:35,469 epoch 9 - iter 693/992 - loss 0.01125951 - time (sec): 40.88 - samples/sec: 2762.21 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:30:41,410 epoch 9 - iter 792/992 - loss 0.01144688 - time (sec): 46.82 - samples/sec: 2773.64 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:30:47,347 epoch 9 - iter 891/992 - loss 0.01171083 - time (sec): 52.76 - samples/sec: 2778.18 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:30:53,864 epoch 9 - iter 990/992 - loss 0.01160068 - time (sec): 59.27 - samples/sec: 2759.87 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:30:53,972 ----------------------------------------------------------------------------------------------------
2023-10-13 22:30:53,972 EPOCH 9 done: loss 0.0116 - lr: 0.000003
2023-10-13 22:30:57,399 DEV : loss 0.21158859133720398 - f1-score (micro avg) 0.7642
2023-10-13 22:30:57,420 ----------------------------------------------------------------------------------------------------
2023-10-13 22:31:03,143 epoch 10 - iter 99/992 - loss 0.01294454 - time (sec): 5.72 - samples/sec: 2871.83 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:31:09,337 epoch 10 - iter 198/992 - loss 0.01174616 - time (sec): 11.92 - samples/sec: 2875.49 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:31:14,913 epoch 10 - iter 297/992 - loss 0.01070740 - time (sec): 17.49 - samples/sec: 2844.70 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:31:20,657 epoch 10 - iter 396/992 - loss 0.01045205 - time (sec): 23.24 - samples/sec: 2850.69 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:31:26,862 epoch 10 - iter 495/992 - loss 0.01098101 - time (sec): 29.44 - samples/sec: 2830.26 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:31:32,596 epoch 10 - iter 594/992 - loss 0.01073008 - time (sec): 35.18 - samples/sec: 2824.57 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:31:38,266 epoch 10 - iter 693/992 - loss 0.01015657 - time (sec): 40.85 - samples/sec: 2806.63 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:31:44,347 epoch 10 - iter 792/992 - loss 0.00973608 - time (sec): 46.93 - samples/sec: 2792.56 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:31:50,396 epoch 10 - iter 891/992 - loss 0.00928079 - time (sec): 52.98 - samples/sec: 2788.54 - lr: 0.000000 - momentum: 0.000000
2023-10-13 22:31:56,043 epoch 10 - iter 990/992 - loss 0.00889257 - time (sec): 58.62 - samples/sec: 2792.15 - lr: 0.000000 - momentum: 0.000000
2023-10-13 22:31:56,156 ----------------------------------------------------------------------------------------------------
2023-10-13 22:31:56,156 EPOCH 10 done: loss 0.0089 - lr: 0.000000
2023-10-13 22:31:59,648 DEV : loss 0.21550461649894714 - f1-score (micro avg) 0.7694
2023-10-13 22:32:00,091 ----------------------------------------------------------------------------------------------------
2023-10-13 22:32:00,092 Loading model from best epoch ...
2023-10-13 22:32:01,548 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
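The 13-tag dictionary above is the BIOES scheme (S = single-token entity, B = begin, I = inside, E = end, O = outside) over the three entity types PER, LOC, ORG. A minimal sketch of how such a tag sequence decodes into entity spans, assuming a well-formed sequence (this is illustrative, not Flair's decoder):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":          # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":        # entity begins here
            start = i
        elif prefix == "E" and start is not None:   # entity ends here
            spans.append((label, start, i + 1))
            start = None
        # "I" continues an open entity; nothing to record yet
    return spans
```

For instance, `["O", "S-PER", "B-LOC", "I-LOC", "E-LOC"]` decodes to one PER span and one three-token LOC span.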
2023-10-13 22:32:04,837
Results:
- F-score (micro) 0.7902
- F-score (macro) 0.7068
- Accuracy 0.6763
By class:
              precision    recall  f1-score   support

         LOC     0.8268    0.8748    0.8501       655
         PER     0.7026    0.8475    0.7683       223
         ORG     0.5259    0.4803    0.5021       127

   micro avg     0.7635    0.8189    0.7902      1005
   macro avg     0.6851    0.7342    0.7068      1005
weighted avg     0.7612    0.8189    0.7880      1005
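The micro and macro aggregates in the table follow from the per-class rows: micro averaging pools true-positive, predicted, and gold counts across classes before computing precision/recall/F1, while macro averaging simply means the per-class F1 scores. A consistency check that reconstructs approximate counts from the rounded per-class metrics (reconstructed counts, not the run's exact tallies):

```python
# (precision, recall, f1-score, support) per class, copied from the table above
classes = {
    "LOC": (0.8268, 0.8748, 0.8501, 655),
    "PER": (0.7026, 0.8475, 0.7683, 223),
    "ORG": (0.5259, 0.4803, 0.5021, 127),
}

tp = pred = supp = 0
for p, r, f, s in classes.values():
    c_tp = round(r * s)       # true positives ~ recall * support
    tp += c_tp
    pred += round(c_tp / p)   # predicted spans ~ TP / precision
    supp += s

micro_p = tp / pred
micro_r = tp / supp
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
macro_f1 = sum(f for _, _, f, _ in classes.values()) / len(classes)

print(round(micro_f1, 4), round(macro_f1, 4))  # -> 0.7902 0.7068
```

Both values reproduce the table's micro avg (0.7902) and macro avg (0.7068) F1 scores.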
2023-10-13 22:32:04,837 ----------------------------------------------------------------------------------------------------