2023-10-13 21:26:19,181 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 Train: 7936 sentences
2023-10-13 21:26:19,182 (train_with_dev=False, train_with_test=False)
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 Training Params:
2023-10-13 21:26:19,182 - learning_rate: "3e-05"
2023-10-13 21:26:19,182 - mini_batch_size: "8"
2023-10-13 21:26:19,182 - max_epochs: "10"
2023-10-13 21:26:19,182 - shuffle: "True"
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 Plugins:
2023-10-13 21:26:19,182 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 21:26:19,182 - metric: "('micro avg', 'f1-score')"
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,182 Computation:
2023-10-13 21:26:19,182 - compute on device: cuda:0
2023-10-13 21:26:19,182 - embedding storage: none
2023-10-13 21:26:19,182 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,183 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 21:26:19,183 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:19,183 ----------------------------------------------------------------------------------------------------
2023-10-13 21:26:25,065 epoch 1 - iter 99/992 - loss 2.28557417 - time (sec): 5.88 - samples/sec: 2733.74 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:26:31,062 epoch 1 - iter 198/992 - loss 1.39151134 - time (sec): 11.88 - samples/sec: 2741.70 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:26:36,969 epoch 1 - iter 297/992 - loss 1.02546157 - time (sec): 17.79 - samples/sec: 2764.99 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:26:42,777 epoch 1 - iter 396/992 - loss 0.82409225 - time (sec): 23.59 - samples/sec: 2767.68 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:26:48,467 epoch 1 - iter 495/992 - loss 0.69589370 - time (sec): 29.28 - samples/sec: 2780.04 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:26:54,198 epoch 1 - iter 594/992 - loss 0.60583632 - time (sec): 35.01 - samples/sec: 2786.20 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:27:00,081 epoch 1 - iter 693/992 - loss 0.53770132 - time (sec): 40.90 - samples/sec: 2804.16 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:27:06,207 epoch 1 - iter 792/992 - loss 0.48422796 - time (sec): 47.02 - samples/sec: 2810.64 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:27:12,012 epoch 1 - iter 891/992 - loss 0.44737479 - time (sec): 52.83 - samples/sec: 2803.31 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:27:18,204 epoch 1 - iter 990/992 - loss 0.41710425 - time (sec): 59.02 - samples/sec: 2774.99 - lr: 0.000030 - momentum: 0.000000
2023-10-13 21:27:18,322 ----------------------------------------------------------------------------------------------------
2023-10-13 21:27:18,322 EPOCH 1 done: loss 0.4167 - lr: 0.000030
2023-10-13 21:27:21,503 DEV : loss 0.09923805296421051 - f1-score (micro avg) 0.7086
2023-10-13 21:27:21,525 saving best model
2023-10-13 21:27:21,981 ----------------------------------------------------------------------------------------------------
2023-10-13 21:27:28,019 epoch 2 - iter 99/992 - loss 0.11613135 - time (sec): 6.04 - samples/sec: 2859.34 - lr: 0.000030 - momentum: 0.000000
2023-10-13 21:27:33,706 epoch 2 - iter 198/992 - loss 0.11269936 - time (sec): 11.72 - samples/sec: 2765.83 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:27:40,128 epoch 2 - iter 297/992 - loss 0.10852299 - time (sec): 18.14 - samples/sec: 2746.57 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:27:45,920 epoch 2 - iter 396/992 - loss 0.10649798 - time (sec): 23.94 - samples/sec: 2705.05 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:27:51,989 epoch 2 - iter 495/992 - loss 0.10652407 - time (sec): 30.01 - samples/sec: 2737.62 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:27:57,928 epoch 2 - iter 594/992 - loss 0.10483546 - time (sec): 35.94 - samples/sec: 2729.34 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:28:04,046 epoch 2 - iter 693/992 - loss 0.10404519 - time (sec): 42.06 - samples/sec: 2724.36 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:28:09,845 epoch 2 - iter 792/992 - loss 0.10351903 - time (sec): 47.86 - samples/sec: 2727.54 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:28:15,662 epoch 2 - iter 891/992 - loss 0.10258918 - time (sec): 53.68 - samples/sec: 2744.25 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:28:21,494 epoch 2 - iter 990/992 - loss 0.10133427 - time (sec): 59.51 - samples/sec: 2751.72 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:28:21,607 ----------------------------------------------------------------------------------------------------
2023-10-13 21:28:21,607 EPOCH 2 done: loss 0.1013 - lr: 0.000027
2023-10-13 21:28:25,515 DEV : loss 0.0804639682173729 - f1-score (micro avg) 0.7385
2023-10-13 21:28:25,535 saving best model
2023-10-13 21:28:26,011 ----------------------------------------------------------------------------------------------------
2023-10-13 21:28:31,880 epoch 3 - iter 99/992 - loss 0.06775603 - time (sec): 5.87 - samples/sec: 2791.67 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:28:37,828 epoch 3 - iter 198/992 - loss 0.06948443 - time (sec): 11.82 - samples/sec: 2698.32 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:28:43,969 epoch 3 - iter 297/992 - loss 0.06447600 - time (sec): 17.96 - samples/sec: 2753.14 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:28:50,058 epoch 3 - iter 396/992 - loss 0.06751946 - time (sec): 24.05 - samples/sec: 2771.64 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:28:55,993 epoch 3 - iter 495/992 - loss 0.06685992 - time (sec): 29.98 - samples/sec: 2763.41 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:29:01,692 epoch 3 - iter 594/992 - loss 0.06777144 - time (sec): 35.68 - samples/sec: 2775.56 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:29:07,475 epoch 3 - iter 693/992 - loss 0.06708595 - time (sec): 41.46 - samples/sec: 2773.37 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:29:13,244 epoch 3 - iter 792/992 - loss 0.06891828 - time (sec): 47.23 - samples/sec: 2784.55 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:29:19,038 epoch 3 - iter 891/992 - loss 0.06968673 - time (sec): 53.03 - samples/sec: 2786.41 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:29:24,682 epoch 3 - iter 990/992 - loss 0.06906410 - time (sec): 58.67 - samples/sec: 2787.68 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:29:24,800 ----------------------------------------------------------------------------------------------------
2023-10-13 21:29:24,801 EPOCH 3 done: loss 0.0690 - lr: 0.000023
2023-10-13 21:29:28,299 DEV : loss 0.1052832305431366 - f1-score (micro avg) 0.7509
2023-10-13 21:29:28,329 saving best model
2023-10-13 21:29:28,851 ----------------------------------------------------------------------------------------------------
2023-10-13 21:29:35,125 epoch 4 - iter 99/992 - loss 0.04809586 - time (sec): 6.27 - samples/sec: 2636.57 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:29:41,100 epoch 4 - iter 198/992 - loss 0.04947603 - time (sec): 12.25 - samples/sec: 2714.99 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:29:46,795 epoch 4 - iter 297/992 - loss 0.05118762 - time (sec): 17.94 - samples/sec: 2734.62 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:29:52,796 epoch 4 - iter 396/992 - loss 0.04965416 - time (sec): 23.94 - samples/sec: 2717.92 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:29:58,771 epoch 4 - iter 495/992 - loss 0.04848418 - time (sec): 29.92 - samples/sec: 2721.76 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:30:04,653 epoch 4 - iter 594/992 - loss 0.04858443 - time (sec): 35.80 - samples/sec: 2729.07 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:30:10,562 epoch 4 - iter 693/992 - loss 0.04967146 - time (sec): 41.71 - samples/sec: 2732.89 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:30:16,426 epoch 4 - iter 792/992 - loss 0.04938520 - time (sec): 47.57 - samples/sec: 2731.92 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:30:22,330 epoch 4 - iter 891/992 - loss 0.04876000 - time (sec): 53.48 - samples/sec: 2740.55 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:30:28,272 epoch 4 - iter 990/992 - loss 0.04971305 - time (sec): 59.42 - samples/sec: 2755.83 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:30:28,383 ----------------------------------------------------------------------------------------------------
2023-10-13 21:30:28,383 EPOCH 4 done: loss 0.0497 - lr: 0.000020
2023-10-13 21:30:31,897 DEV : loss 0.12055560946464539 - f1-score (micro avg) 0.7455
2023-10-13 21:30:31,919 ----------------------------------------------------------------------------------------------------
2023-10-13 21:30:37,673 epoch 5 - iter 99/992 - loss 0.04227405 - time (sec): 5.75 - samples/sec: 2891.99 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:30:43,513 epoch 5 - iter 198/992 - loss 0.03608106 - time (sec): 11.59 - samples/sec: 2842.42 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:30:49,376 epoch 5 - iter 297/992 - loss 0.03886863 - time (sec): 17.46 - samples/sec: 2822.33 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:30:56,245 epoch 5 - iter 396/992 - loss 0.03816738 - time (sec): 24.32 - samples/sec: 2740.89 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:31:02,116 epoch 5 - iter 495/992 - loss 0.03804566 - time (sec): 30.20 - samples/sec: 2755.43 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:31:07,852 epoch 5 - iter 594/992 - loss 0.03689071 - time (sec): 35.93 - samples/sec: 2761.47 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:31:13,786 epoch 5 - iter 693/992 - loss 0.03872646 - time (sec): 41.87 - samples/sec: 2749.53 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:31:19,687 epoch 5 - iter 792/992 - loss 0.03754210 - time (sec): 47.77 - samples/sec: 2754.69 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:31:25,930 epoch 5 - iter 891/992 - loss 0.03755859 - time (sec): 54.01 - samples/sec: 2741.27 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:31:31,815 epoch 5 - iter 990/992 - loss 0.03809129 - time (sec): 59.89 - samples/sec: 2731.36 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:31:31,954 ----------------------------------------------------------------------------------------------------
2023-10-13 21:31:31,954 EPOCH 5 done: loss 0.0381 - lr: 0.000017
2023-10-13 21:31:35,520 DEV : loss 0.13742151856422424 - f1-score (micro avg) 0.7587
2023-10-13 21:31:35,544 saving best model
2023-10-13 21:31:36,125 ----------------------------------------------------------------------------------------------------
2023-10-13 21:31:42,337 epoch 6 - iter 99/992 - loss 0.02712909 - time (sec): 6.21 - samples/sec: 2670.73 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:31:48,728 epoch 6 - iter 198/992 - loss 0.02785322 - time (sec): 12.60 - samples/sec: 2683.50 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:31:54,715 epoch 6 - iter 297/992 - loss 0.02848273 - time (sec): 18.59 - samples/sec: 2672.94 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:32:00,706 epoch 6 - iter 396/992 - loss 0.02785739 - time (sec): 24.58 - samples/sec: 2667.29 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:32:06,591 epoch 6 - iter 495/992 - loss 0.02751215 - time (sec): 30.46 - samples/sec: 2689.46 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:32:12,351 epoch 6 - iter 594/992 - loss 0.02798228 - time (sec): 36.22 - samples/sec: 2715.72 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:32:18,302 epoch 6 - iter 693/992 - loss 0.02829397 - time (sec): 42.18 - samples/sec: 2730.18 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:32:24,156 epoch 6 - iter 792/992 - loss 0.02864477 - time (sec): 48.03 - samples/sec: 2736.41 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:32:30,000 epoch 6 - iter 891/992 - loss 0.02875296 - time (sec): 53.87 - samples/sec: 2736.42 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:32:36,022 epoch 6 - iter 990/992 - loss 0.02942813 - time (sec): 59.90 - samples/sec: 2732.62 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:32:36,132 ----------------------------------------------------------------------------------------------------
2023-10-13 21:32:36,132 EPOCH 6 done: loss 0.0295 - lr: 0.000013
2023-10-13 21:32:39,605 DEV : loss 0.16862468421459198 - f1-score (micro avg) 0.7574
2023-10-13 21:32:39,626 ----------------------------------------------------------------------------------------------------
2023-10-13 21:32:45,397 epoch 7 - iter 99/992 - loss 0.02084181 - time (sec): 5.77 - samples/sec: 2663.18 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:32:51,692 epoch 7 - iter 198/992 - loss 0.02364712 - time (sec): 12.07 - samples/sec: 2669.23 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:32:57,470 epoch 7 - iter 297/992 - loss 0.02055440 - time (sec): 17.84 - samples/sec: 2677.99 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:33:03,230 epoch 7 - iter 396/992 - loss 0.02123931 - time (sec): 23.60 - samples/sec: 2709.80 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:33:09,233 epoch 7 - iter 495/992 - loss 0.02161191 - time (sec): 29.61 - samples/sec: 2740.96 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:33:15,904 epoch 7 - iter 594/992 - loss 0.02114814 - time (sec): 36.28 - samples/sec: 2714.40 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:33:21,533 epoch 7 - iter 693/992 - loss 0.02119562 - time (sec): 41.91 - samples/sec: 2712.74 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:33:27,488 epoch 7 - iter 792/992 - loss 0.02088100 - time (sec): 47.86 - samples/sec: 2730.44 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:33:33,206 epoch 7 - iter 891/992 - loss 0.02187657 - time (sec): 53.58 - samples/sec: 2744.28 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:33:38,997 epoch 7 - iter 990/992 - loss 0.02186821 - time (sec): 59.37 - samples/sec: 2755.54 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:33:39,128 ----------------------------------------------------------------------------------------------------
2023-10-13 21:33:39,129 EPOCH 7 done: loss 0.0218 - lr: 0.000010
2023-10-13 21:33:42,635 DEV : loss 0.18575911223888397 - f1-score (micro avg) 0.7592
2023-10-13 21:33:42,667 saving best model
2023-10-13 21:33:43,180 ----------------------------------------------------------------------------------------------------
2023-10-13 21:33:49,709 epoch 8 - iter 99/992 - loss 0.01242652 - time (sec): 6.52 - samples/sec: 2504.77 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:33:55,885 epoch 8 - iter 198/992 - loss 0.01492552 - time (sec): 12.70 - samples/sec: 2595.31 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:34:01,748 epoch 8 - iter 297/992 - loss 0.01415991 - time (sec): 18.56 - samples/sec: 2631.07 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:34:07,558 epoch 8 - iter 396/992 - loss 0.01548353 - time (sec): 24.37 - samples/sec: 2700.18 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:34:13,303 epoch 8 - iter 495/992 - loss 0.01575741 - time (sec): 30.12 - samples/sec: 2718.52 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:34:18,907 epoch 8 - iter 594/992 - loss 0.01594169 - time (sec): 35.72 - samples/sec: 2723.58 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:34:24,857 epoch 8 - iter 693/992 - loss 0.01654981 - time (sec): 41.67 - samples/sec: 2735.99 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:34:31,096 epoch 8 - iter 792/992 - loss 0.01587979 - time (sec): 47.91 - samples/sec: 2746.56 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:34:36,972 epoch 8 - iter 891/992 - loss 0.01575634 - time (sec): 53.79 - samples/sec: 2741.67 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:34:42,829 epoch 8 - iter 990/992 - loss 0.01614502 - time (sec): 59.64 - samples/sec: 2745.24 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:34:42,936 ----------------------------------------------------------------------------------------------------
2023-10-13 21:34:42,936 EPOCH 8 done: loss 0.0161 - lr: 0.000007
2023-10-13 21:34:46,404 DEV : loss 0.20498403906822205 - f1-score (micro avg) 0.7587
2023-10-13 21:34:46,426 ----------------------------------------------------------------------------------------------------
2023-10-13 21:34:52,358 epoch 9 - iter 99/992 - loss 0.01356369 - time (sec): 5.93 - samples/sec: 2823.61 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:34:58,266 epoch 9 - iter 198/992 - loss 0.01060758 - time (sec): 11.84 - samples/sec: 2769.20 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:35:03,917 epoch 9 - iter 297/992 - loss 0.01135131 - time (sec): 17.49 - samples/sec: 2781.11 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:35:09,966 epoch 9 - iter 396/992 - loss 0.01091601 - time (sec): 23.54 - samples/sec: 2794.35 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:35:15,776 epoch 9 - iter 495/992 - loss 0.01147457 - time (sec): 29.35 - samples/sec: 2811.81 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:35:21,554 epoch 9 - iter 594/992 - loss 0.01149657 - time (sec): 35.13 - samples/sec: 2792.52 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:35:27,248 epoch 9 - iter 693/992 - loss 0.01184055 - time (sec): 40.82 - samples/sec: 2796.65 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:35:33,220 epoch 9 - iter 792/992 - loss 0.01192689 - time (sec): 46.79 - samples/sec: 2799.57 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:35:39,316 epoch 9 - iter 891/992 - loss 0.01323888 - time (sec): 52.89 - samples/sec: 2795.24 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:35:45,243 epoch 9 - iter 990/992 - loss 0.01337615 - time (sec): 58.82 - samples/sec: 2781.18 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:35:45,385 ----------------------------------------------------------------------------------------------------
2023-10-13 21:35:45,385 EPOCH 9 done: loss 0.0134 - lr: 0.000003
2023-10-13 21:35:49,385 DEV : loss 0.20130668580532074 - f1-score (micro avg) 0.7596
2023-10-13 21:35:49,409 saving best model
2023-10-13 21:35:49,998 ----------------------------------------------------------------------------------------------------
2023-10-13 21:35:56,107 epoch 10 - iter 99/992 - loss 0.01231225 - time (sec): 6.11 - samples/sec: 2791.64 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:36:01,882 epoch 10 - iter 198/992 - loss 0.00908713 - time (sec): 11.88 - samples/sec: 2758.24 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:36:07,740 epoch 10 - iter 297/992 - loss 0.00865854 - time (sec): 17.74 - samples/sec: 2751.19 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:36:13,958 epoch 10 - iter 396/992 - loss 0.00923764 - time (sec): 23.96 - samples/sec: 2731.85 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:36:19,594 epoch 10 - iter 495/992 - loss 0.00934583 - time (sec): 29.59 - samples/sec: 2753.35 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:36:25,379 epoch 10 - iter 594/992 - loss 0.00873248 - time (sec): 35.38 - samples/sec: 2769.19 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:36:31,203 epoch 10 - iter 693/992 - loss 0.00930823 - time (sec): 41.20 - samples/sec: 2774.68 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:36:37,052 epoch 10 - iter 792/992 - loss 0.00939506 - time (sec): 47.05 - samples/sec: 2782.82 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:36:42,967 epoch 10 - iter 891/992 - loss 0.00942042 - time (sec): 52.97 - samples/sec: 2794.26 - lr: 0.000000 - momentum: 0.000000
2023-10-13 21:36:48,640 epoch 10 - iter 990/992 - loss 0.00923242 - time (sec): 58.64 - samples/sec: 2790.71 - lr: 0.000000 - momentum: 0.000000
2023-10-13 21:36:48,750 ----------------------------------------------------------------------------------------------------
2023-10-13 21:36:48,750 EPOCH 10 done: loss 0.0092 - lr: 0.000000
2023-10-13 21:36:52,276 DEV : loss 0.21729230880737305 - f1-score (micro avg) 0.7547
2023-10-13 21:36:52,716 ----------------------------------------------------------------------------------------------------
2023-10-13 21:36:52,717 Loading model from best epoch ...
2023-10-13 21:36:54,254 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-13 21:36:57,774 Results:
- F-score (micro) 0.7742
- F-score (macro) 0.7002
- Accuracy 0.6494

By class:
              precision    recall  f1-score   support

         LOC     0.7980    0.8504    0.8234       655
         PER     0.7312    0.8296    0.7773       223
         ORG     0.5124    0.4882    0.5000       127

   micro avg     0.7500    0.8000    0.7742      1005
   macro avg     0.6805    0.7227    0.7002      1005
weighted avg     0.7471    0.8000    0.7723      1005

2023-10-13 21:36:57,774 ----------------------------------------------------------------------------------------------------
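The `LinearScheduler | warmup_fraction: '0.1'` plugin in the log warms the learning rate up over the first 10% of the 9,920 total batch steps (992 mini-batches × 10 epochs) and then decays it linearly to zero, which is what the logged `lr` column shows: ≈3e-06 at iter 99 of epoch 1, the 3e-05 peak at the end of epoch 1, and 0 at the end of epoch 10. A minimal sketch of that schedule, assuming this standard warmup-then-decay shape (an illustration inferred from the logged values, not Flair's exact `LinearScheduler` implementation, which may differ in off-by-one details):

```python
def linear_warmup_lr(step: int, total_steps: int,
                     peak_lr: float = 3e-5,
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero (hypothetical sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 992 of 9920 here
    if step < warmup_steps:
        # ramp up: 0 -> peak_lr over the warmup phase
        return peak_lr * step / max(1, warmup_steps)
    # ramp down: peak_lr -> 0 over the remaining steps
    return peak_lr * (total_steps - step) / max(1, total_steps - warmup_steps)

TOTAL_STEPS = 992 * 10  # 992 mini-batches per epoch, 10 epochs

print(linear_warmup_lr(992, TOTAL_STEPS))          # peak lr at end of warmup: 3e-05
print(linear_warmup_lr(TOTAL_STEPS, TOTAL_STEPS))  # fully decayed: 0.0
```

Evaluated at the logged checkpoints (e.g. step 1980, end of epoch 2) this reproduces the ~2.7e-05 value the trainer prints there.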
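The micro and macro aggregates in the final table can be recomputed from the per-class rows as a sanity check: micro F1 is the harmonic mean of the pooled precision and recall over all 1,005 test entities, while macro F1 is the unweighted mean of the three per-class F1 scores. A short check using the numbers from the table above:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg pools all 1005 entities: P = 0.7500, R = 0.8000 (from the table)
micro_f1 = f1(0.7500, 0.8000)

# macro avg is the unweighted mean of the per-class F1 scores (LOC, PER, ORG)
macro_f1 = (0.8234 + 0.7773 + 0.5000) / 3

print(round(micro_f1, 4))  # 0.7742, matching the reported micro F-score
print(round(macro_f1, 4))  # 0.7002, matching the reported macro F-score
```

The best model was selected on dev micro F1 (0.7596 at epoch 9), so the 0.7742 test micro F1 comes from that epoch-9 checkpoint, not the final epoch.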