2023-10-13 22:21:32,062 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,063 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 22:21:32,063 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,063 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 22:21:32,063 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,063 Train:  7936 sentences
2023-10-13 22:21:32,063         (train_with_dev=False, train_with_test=False)
2023-10-13 22:21:32,063 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,064 Training Params:
2023-10-13 22:21:32,064  - learning_rate: "3e-05"
2023-10-13 22:21:32,064  - mini_batch_size: "8"
2023-10-13 22:21:32,064  - max_epochs: "10"
2023-10-13 22:21:32,064  - shuffle: "True"
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,064 Plugins:
2023-10-13 22:21:32,064  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,064 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 22:21:32,064  - metric: "('micro avg', 'f1-score')"
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,064 Computation:
2023-10-13 22:21:32,064  - compute on device: cuda:0
2023-10-13 22:21:32,064  - embedding storage: none
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,064 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:32,064 ----------------------------------------------------------------------------------------------------
2023-10-13 22:21:38,226 epoch 1 - iter 99/992 - loss 2.33299355 - time (sec): 6.16 - samples/sec: 2821.27 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:21:43,899 epoch 1 - iter 198/992 - loss 1.45654399 - time (sec): 11.83 - samples/sec: 2803.24 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:21:49,704 epoch 1 - iter 297/992 - loss 1.08734806 - time (sec): 17.64 - samples/sec: 2777.39 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:21:55,475 epoch 1 - iter 396/992 - loss 0.87857029 - time (sec): 23.41 - samples/sec: 2773.46 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:22:01,199 epoch 1 - iter 495/992 - loss 0.74066449 - time (sec): 29.13 - samples/sec: 2783.19 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:22:07,235 epoch 1 - iter 594/992 - loss 0.64158469 - time (sec): 35.17 - samples/sec: 2778.31 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:22:12,920 epoch 1 - iter 693/992 - loss 0.57302135 - time (sec): 40.86 - samples/sec: 2782.79 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:22:18,655 epoch 1 - iter 792/992 - loss 0.51792287 - time (sec): 46.59 - samples/sec: 2794.99 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:22:24,750 epoch 1 - iter 891/992 - loss 0.47364366 - time (sec): 52.68 - samples/sec: 2790.38 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:22:30,602 epoch 1 - iter 990/992 - loss 0.43807375 - time (sec): 58.54 - samples/sec: 2793.01 - lr: 0.000030 - momentum: 0.000000
2023-10-13 22:22:30,750 ----------------------------------------------------------------------------------------------------
2023-10-13 22:22:30,750 EPOCH 1 done: loss 0.4368 - lr: 0.000030
2023-10-13 22:22:33,919 DEV : loss 0.09895769506692886 - f1-score (micro avg)  0.6992
2023-10-13 22:22:33,940 saving best model
2023-10-13 22:22:34,356 ----------------------------------------------------------------------------------------------------
2023-10-13 22:22:40,079 epoch 2 - iter 99/992 - loss 0.10440393 - time (sec): 5.72 - samples/sec: 2716.33 - lr: 0.000030 - momentum: 0.000000
2023-10-13 22:22:45,944 epoch 2 - iter 198/992 - loss 0.10904220 - time (sec): 11.59 - samples/sec: 2709.89 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:22:52,058 epoch 2 - iter 297/992 - loss 0.10445266 - time (sec): 17.70 - samples/sec: 2738.80 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:22:57,812 epoch 2 - iter 396/992 - loss 0.10543564 - time (sec): 23.45 - samples/sec: 2776.19 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:23:03,822 epoch 2 - iter 495/992 - loss 0.10401639 - time (sec): 29.46 - samples/sec: 2761.40 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:23:09,512 epoch 2 - iter 594/992 - loss 0.10324383 - time (sec): 35.16 - samples/sec: 2769.06 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:23:15,203 epoch 2 - iter 693/992 - loss 0.10117912 - time (sec): 40.85 - samples/sec: 2773.98 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:23:21,294 epoch 2 - iter 792/992 - loss 0.10187508 - time (sec): 46.94 - samples/sec: 2770.45 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:23:27,242 epoch 2 - iter 891/992 - loss 0.10186016 - time (sec): 52.89 - samples/sec: 2763.64 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:23:33,244 epoch 2 - iter 990/992 - loss 0.10129833 - time (sec): 58.89 - samples/sec: 2778.25 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:23:33,369 ----------------------------------------------------------------------------------------------------
2023-10-13 22:23:33,370 EPOCH 2 done: loss 0.1012 - lr: 0.000027
2023-10-13 22:23:36,856 DEV : loss 0.09443922340869904 - f1-score (micro avg)  0.7469
2023-10-13 22:23:36,889 saving best model
2023-10-13 22:23:37,421 ----------------------------------------------------------------------------------------------------
2023-10-13 22:23:44,041 epoch 3 - iter 99/992 - loss 0.07835091 - time (sec): 6.62 - samples/sec: 2418.55 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:23:49,988 epoch 3 - iter 198/992 - loss 0.07452825 - time (sec): 12.57 - samples/sec: 2565.06 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:23:55,653 epoch 3 - iter 297/992 - loss 0.06912775 - time (sec): 18.23 - samples/sec: 2646.29 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:24:01,450 epoch 3 - iter 396/992 - loss 0.07019004 - time (sec): 24.03 - samples/sec: 2688.30 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:24:07,403 epoch 3 - iter 495/992 - loss 0.06908856 - time (sec): 29.98 - samples/sec: 2698.01 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:24:13,363 epoch 3 - iter 594/992 - loss 0.06956091 - time (sec): 35.94 - samples/sec: 2710.32 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:24:19,337 epoch 3 - iter 693/992 - loss 0.07164602 - time (sec): 41.91 - samples/sec: 2708.74 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:24:25,264 epoch 3 - iter 792/992 - loss 0.07204638 - time (sec): 47.84 - samples/sec: 2726.73 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:24:31,199 epoch 3 - iter 891/992 - loss 0.07025467 - time (sec): 53.78 - samples/sec: 2740.60 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:24:37,045 epoch 3 - iter 990/992 - loss 0.06972650 - time (sec): 59.62 - samples/sec: 2746.36 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:24:37,147 ----------------------------------------------------------------------------------------------------
2023-10-13 22:24:37,147 EPOCH 3 done: loss 0.0697 - lr: 0.000023
2023-10-13 22:24:40,552 DEV : loss 0.10710007697343826 - f1-score (micro avg)  0.7477
2023-10-13 22:24:40,575 saving best model
2023-10-13 22:24:41,077 ----------------------------------------------------------------------------------------------------
2023-10-13 22:24:46,925 epoch 4 - iter 99/992 - loss 0.04664656 - time (sec): 5.84 - samples/sec: 2741.14 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:24:53,063 epoch 4 - iter 198/992 - loss 0.04715979 - time (sec): 11.98 - samples/sec: 2768.98 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:24:59,015 epoch 4 - iter 297/992 - loss 0.04504200 - time (sec): 17.93 - samples/sec: 2753.00 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:25:04,781 epoch 4 - iter 396/992 - loss 0.04552468 - time (sec): 23.70 - samples/sec: 2777.07 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:25:10,615 epoch 4 - iter 495/992 - loss 0.04550867 - time (sec): 29.53 - samples/sec: 2791.41 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:25:16,451 epoch 4 - iter 594/992 - loss 0.04745191 - time (sec): 35.37 - samples/sec: 2789.76 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:25:22,220 epoch 4 - iter 693/992 - loss 0.04806225 - time (sec): 41.14 - samples/sec: 2786.19 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:25:28,218 epoch 4 - iter 792/992 - loss 0.04807101 - time (sec): 47.14 - samples/sec: 2781.22 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:25:33,783 epoch 4 - iter 891/992 - loss 0.04884662 - time (sec): 52.70 - samples/sec: 2795.26 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:25:39,847 epoch 4 - iter 990/992 - loss 0.04904023 - time (sec): 58.77 - samples/sec: 2783.19 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:25:39,971 ----------------------------------------------------------------------------------------------------
2023-10-13 22:25:39,972 EPOCH 4 done: loss 0.0491 - lr: 0.000020
2023-10-13 22:25:43,378 DEV : loss 0.1333416849374771 - f1-score (micro avg)  0.7489
2023-10-13 22:25:43,398 saving best model
2023-10-13 22:25:44,266 ----------------------------------------------------------------------------------------------------
2023-10-13 22:25:50,041 epoch 5 - iter 99/992 - loss 0.03508164 - time (sec): 5.77 - samples/sec: 2761.94 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:25:55,794 epoch 5 - iter 198/992 - loss 0.03490154 - time (sec): 11.52 - samples/sec: 2811.59 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:26:01,520 epoch 5 - iter 297/992 - loss 0.03554118 - time (sec): 17.25 - samples/sec: 2835.25 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:26:07,492 epoch 5 - iter 396/992 - loss 0.03723271 - time (sec): 23.22 - samples/sec: 2830.39 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:26:13,620 epoch 5 - iter 495/992 - loss 0.03770116 - time (sec): 29.35 - samples/sec: 2814.86 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:26:19,504 epoch 5 - iter 594/992 - loss 0.03766517 - time (sec): 35.23 - samples/sec: 2797.30 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:26:25,285 epoch 5 - iter 693/992 - loss 0.03805744 - time (sec): 41.01 - samples/sec: 2792.03 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:26:31,050 epoch 5 - iter 792/992 - loss 0.03684000 - time (sec): 46.78 - samples/sec: 2788.99 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:26:37,088 epoch 5 - iter 891/992 - loss 0.03834438 - time (sec): 52.82 - samples/sec: 2790.28 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:26:42,739 epoch 5 - iter 990/992 - loss 0.03863747 - time (sec): 58.47 - samples/sec: 2796.55 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:26:42,895 ----------------------------------------------------------------------------------------------------
2023-10-13 22:26:42,895 EPOCH 5 done: loss 0.0386 - lr: 0.000017
2023-10-13 22:26:46,395 DEV : loss 0.1474953144788742 - f1-score (micro avg)  0.7697
2023-10-13 22:26:46,416 saving best model
2023-10-13 22:26:46,900 ----------------------------------------------------------------------------------------------------
2023-10-13 22:26:52,655 epoch 6 - iter 99/992 - loss 0.03066144 - time (sec): 5.75 - samples/sec: 2870.08 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:26:58,650 epoch 6 - iter 198/992 - loss 0.02576809 - time (sec): 11.75 - samples/sec: 2842.92 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:27:04,479 epoch 6 - iter 297/992 - loss 0.02640387 - time (sec): 17.58 - samples/sec: 2865.75 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:27:10,420 epoch 6 - iter 396/992 - loss 0.02662851 - time (sec): 23.52 - samples/sec: 2839.78 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:27:16,188 epoch 6 - iter 495/992 - loss 0.02679061 - time (sec): 29.28 - samples/sec: 2812.92 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:27:22,018 epoch 6 - iter 594/992 - loss 0.02779911 - time (sec): 35.12 - samples/sec: 2804.97 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:27:27,615 epoch 6 - iter 693/992 - loss 0.02793110 - time (sec): 40.71 - samples/sec: 2789.84 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:27:33,725 epoch 6 - iter 792/992 - loss 0.02849861 - time (sec): 46.82 - samples/sec: 2800.86 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:27:39,500 epoch 6 - iter 891/992 - loss 0.03003150 - time (sec): 52.60 - samples/sec: 2796.36 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:27:45,315 epoch 6 - iter 990/992 - loss 0.02931730 - time (sec): 58.41 - samples/sec: 2799.63 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:27:45,439 ----------------------------------------------------------------------------------------------------
2023-10-13 22:27:45,439 EPOCH 6 done: loss 0.0294 - lr: 0.000013
2023-10-13 22:27:48,830 DEV : loss 0.18882010877132416 - f1-score (micro avg)  0.7443
2023-10-13 22:27:48,850 ----------------------------------------------------------------------------------------------------
2023-10-13 22:27:54,585 epoch 7 - iter 99/992 - loss 0.01843936 - time (sec): 5.73 - samples/sec: 2830.51 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:28:00,478 epoch 7 - iter 198/992 - loss 0.01884112 - time (sec): 11.63 - samples/sec: 2775.82 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:28:07,139 epoch 7 - iter 297/992 - loss 0.02033178 - time (sec): 18.29 - samples/sec: 2712.51 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:28:13,151 epoch 7 - iter 396/992 - loss 0.02087243 - time (sec): 24.30 - samples/sec: 2732.06 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:28:19,012 epoch 7 - iter 495/992 - loss 0.02087418 - time (sec): 30.16 - samples/sec: 2747.66 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:28:24,686 epoch 7 - iter 594/992 - loss 0.02171817 - time (sec): 35.83 - samples/sec: 2751.03 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:28:30,346 epoch 7 - iter 693/992 - loss 0.02240224 - time (sec): 41.49 - samples/sec: 2767.35 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:28:36,072 epoch 7 - iter 792/992 - loss 0.02183838 - time (sec): 47.22 - samples/sec: 2779.85 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:28:41,868 epoch 7 - iter 891/992 - loss 0.02200465 - time (sec): 53.02 - samples/sec: 2773.88 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:28:47,905 epoch 7 - iter 990/992 - loss 0.02214401 - time (sec): 59.05 - samples/sec: 2773.37 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:28:48,003 ----------------------------------------------------------------------------------------------------
2023-10-13 22:28:48,003 EPOCH 7 done: loss 0.0221 - lr: 0.000010
2023-10-13 22:28:51,462 DEV : loss 0.18745747208595276 - f1-score (micro avg)  0.7648
2023-10-13 22:28:51,483 ----------------------------------------------------------------------------------------------------
2023-10-13 22:28:57,625 epoch 8 - iter 99/992 - loss 0.01733595 - time (sec): 6.14 - samples/sec: 2687.32 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:29:03,732 epoch 8 - iter 198/992 - loss 0.01451052 - time (sec): 12.25 - samples/sec: 2732.96 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:29:09,698 epoch 8 - iter 297/992 - loss 0.01458980 - time (sec): 18.21 - samples/sec: 2747.88 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:29:15,627 epoch 8 - iter 396/992 - loss 0.01434329 - time (sec): 24.14 - samples/sec: 2786.27 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:29:21,451 epoch 8 - iter 495/992 - loss 0.01589815 - time (sec): 29.97 - samples/sec: 2787.43 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:29:27,300 epoch 8 - iter 594/992 - loss 0.01547773 - time (sec): 35.82 - samples/sec: 2772.84 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:29:33,240 epoch 8 - iter 693/992 - loss 0.01546711 - time (sec): 41.76 - samples/sec: 2752.58 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:29:39,317 epoch 8 - iter 792/992 - loss 0.01554967 - time (sec): 47.83 - samples/sec: 2748.23 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:29:45,154 epoch 8 - iter 891/992 - loss 0.01624295 - time (sec): 53.67 - samples/sec: 2751.29 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:29:51,001 epoch 8 - iter 990/992 - loss 0.01584456 - time (sec): 59.52 - samples/sec: 2752.74 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:29:51,101 ----------------------------------------------------------------------------------------------------
2023-10-13 22:29:51,101 EPOCH 8 done: loss 0.0160 - lr: 0.000007
2023-10-13 22:29:54,567 DEV : loss 0.19708073139190674 - f1-score (micro avg)  0.7606
2023-10-13 22:29:54,588 ----------------------------------------------------------------------------------------------------
2023-10-13 22:30:00,193 epoch 9 - iter 99/992 - loss 0.00851250 - time (sec): 5.60 - samples/sec: 2638.38 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:30:05,871 epoch 9 - iter 198/992 - loss 0.01387637 - time (sec): 11.28 - samples/sec: 2720.77 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:30:11,835 epoch 9 - iter 297/992 - loss 0.01335263 - time (sec): 17.25 - samples/sec: 2715.07 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:30:17,685 epoch 9 - iter 396/992 - loss 0.01153281 - time (sec): 23.10 - samples/sec: 2735.71 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:30:23,804 epoch 9 - iter 495/992 - loss 0.01105092 - time (sec): 29.21 - samples/sec: 2740.26 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:30:29,756 epoch 9 - iter 594/992 - loss 0.01139772 - time (sec): 35.17 - samples/sec: 2748.28 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:30:35,469 epoch 9 - iter 693/992 - loss 0.01125951 - time (sec): 40.88 - samples/sec: 2762.21 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:30:41,410 epoch 9 - iter 792/992 - loss 0.01144688 - time (sec): 46.82 - samples/sec: 2773.64 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:30:47,347 epoch 9 - iter 891/992 - loss 0.01171083 - time (sec): 52.76 - samples/sec: 2778.18 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:30:53,864 epoch 9 - iter 990/992 - loss 0.01160068 - time (sec): 59.27 - samples/sec: 2759.87 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:30:53,972 ----------------------------------------------------------------------------------------------------
2023-10-13 22:30:53,972 EPOCH 9 done: loss 0.0116 - lr: 0.000003
2023-10-13 22:30:57,399 DEV : loss 0.21158859133720398 - f1-score (micro avg)  0.7642
2023-10-13 22:30:57,420 ----------------------------------------------------------------------------------------------------
2023-10-13 22:31:03,143 epoch 10 - iter 99/992 - loss 0.01294454 - time (sec): 5.72 - samples/sec: 2871.83 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:31:09,337 epoch 10 - iter 198/992 - loss 0.01174616 - time (sec): 11.92 - samples/sec: 2875.49 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:31:14,913 epoch 10 - iter 297/992 - loss 0.01070740 - time (sec): 17.49 - samples/sec: 2844.70 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:31:20,657 epoch 10 - iter 396/992 - loss 0.01045205 - time (sec): 23.24 - samples/sec: 2850.69 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:31:26,862 epoch 10 - iter 495/992 - loss 0.01098101 - time (sec): 29.44 - samples/sec: 2830.26 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:31:32,596 epoch 10 - iter 594/992 - loss 0.01073008 - time (sec): 35.18 - samples/sec: 2824.57 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:31:38,266 epoch 10 - iter 693/992 - loss 0.01015657 - time (sec): 40.85 - samples/sec: 2806.63 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:31:44,347 epoch 10 - iter 792/992 - loss 0.00973608 - time (sec): 46.93 - samples/sec: 2792.56 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:31:50,396 epoch 10 - iter 891/992 - loss 0.00928079 - time (sec): 52.98 - samples/sec: 2788.54 - lr: 0.000000 - momentum: 0.000000
2023-10-13 22:31:56,043 epoch 10 - iter 990/992 - loss 0.00889257 - time (sec): 58.62 - samples/sec: 2792.15 - lr: 0.000000 - momentum: 0.000000
2023-10-13 22:31:56,156 ----------------------------------------------------------------------------------------------------
2023-10-13 22:31:56,156 EPOCH 10 done: loss 0.0089 - lr: 0.000000
2023-10-13 22:31:59,648 DEV : loss 0.21550461649894714 - f1-score (micro avg)  0.7694
2023-10-13 22:32:00,091 ----------------------------------------------------------------------------------------------------
2023-10-13 22:32:00,092 Loading model from best epoch ...
2023-10-13 22:32:01,548 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-13 22:32:04,837 
Results:
- F-score (micro) 0.7902
- F-score (macro) 0.7068
- Accuracy 0.6763

By class:
              precision    recall  f1-score   support

         LOC     0.8268    0.8748    0.8501       655
         PER     0.7026    0.8475    0.7683       223
         ORG     0.5259    0.4803    0.5021       127

   micro avg     0.7635    0.8189    0.7902      1005
   macro avg     0.6851    0.7342    0.7068      1005
weighted avg     0.7612    0.8189    0.7880      1005

2023-10-13 22:32:04,837 ----------------------------------------------------------------------------------------------------