2023-10-14 23:13:54,062 ----------------------------------------------------------------------------------------------------
2023-10-14 23:13:54,063 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
2023-10-14 23:13:54,063 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
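The classification head above (`linear`: 768 → 13) maps each token embedding onto 13 output classes. A minimal sketch of where that number comes from, assuming the BIOES tagging scheme over this corpus's three entity types (loc, pers, org) plus the outside tag `O` (`bioes_tags` is a hypothetical helper, not Flair API):

```python
# Sketch: the 13-dimensional output of the classification head
# (linear: 768 -> 13) corresponds to a BIOES tag inventory over
# loc / pers / org, plus the outside tag "O".
BIOES_PREFIXES = ["S", "B", "E", "I"]  # single, begin, end, inside


def bioes_tags(entity_types):
    """Return 'O' plus the four BIOES variants of each entity type."""
    tags = ["O"]
    for etype in entity_types:
        for prefix in BIOES_PREFIXES:
            tags.append(f"{prefix}-{etype}")
    return tags


tags = bioes_tags(["loc", "pers", "org"])
print(len(tags))  # 13 output classes
```

This reproduces the tag dictionary the log reports when reloading the best model (`O, S-loc, B-loc, E-loc, I-loc, S-pers, ...`).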
2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
2023-10-14 23:13:54,063 Train:  14465 sentences
2023-10-14 23:13:54,063         (train_with_dev=False, train_with_test=False)
2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
2023-10-14 23:13:54,063 Training Params:
2023-10-14 23:13:54,063  - learning_rate: "5e-05"
2023-10-14 23:13:54,063  - mini_batch_size: "8"
2023-10-14 23:13:54,063  - max_epochs: "10"
2023-10-14 23:13:54,063  - shuffle: "True"
2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
2023-10-14 23:13:54,063 Plugins:
2023-10-14 23:13:54,063  - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
2023-10-14 23:13:54,063 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 23:13:54,063  - metric: "('micro avg', 'f1-score')"
2023-10-14 23:13:54,063 ----------------------------------------------------------------------------------------------------
2023-10-14 23:13:54,063 Computation:
2023-10-14 23:13:54,063  - compute on device: cuda:0
2023-10-14 23:13:54,064  - embedding storage: none
2023-10-14 23:13:54,064 ----------------------------------------------------------------------------------------------------
2023-10-14 23:13:54,064 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-14 23:13:54,064 ----------------------------------------------------------------------------------------------------
2023-10-14 23:13:54,064 ----------------------------------------------------------------------------------------------------
2023-10-14 23:14:05,117 epoch 1 - iter 180/1809 - loss 1.38557352 - time (sec): 11.05 - samples/sec:
3462.40 - lr: 0.000005 - momentum: 0.000000
2023-10-14 23:14:16,159 epoch 1 - iter 360/1809 - loss 0.80804825 - time (sec): 22.09 - samples/sec: 3436.57 - lr: 0.000010 - momentum: 0.000000
2023-10-14 23:14:27,057 epoch 1 - iter 540/1809 - loss 0.59504882 - time (sec): 32.99 - samples/sec: 3419.21 - lr: 0.000015 - momentum: 0.000000
2023-10-14 23:14:38,311 epoch 1 - iter 720/1809 - loss 0.47345095 - time (sec): 44.25 - samples/sec: 3442.23 - lr: 0.000020 - momentum: 0.000000
2023-10-14 23:14:49,306 epoch 1 - iter 900/1809 - loss 0.40291276 - time (sec): 55.24 - samples/sec: 3439.24 - lr: 0.000025 - momentum: 0.000000
2023-10-14 23:15:00,620 epoch 1 - iter 1080/1809 - loss 0.35417304 - time (sec): 66.56 - samples/sec: 3444.05 - lr: 0.000030 - momentum: 0.000000
2023-10-14 23:15:11,503 epoch 1 - iter 1260/1809 - loss 0.32111486 - time (sec): 77.44 - samples/sec: 3438.10 - lr: 0.000035 - momentum: 0.000000
2023-10-14 23:15:22,452 epoch 1 - iter 1440/1809 - loss 0.29407849 - time (sec): 88.39 - samples/sec: 3437.09 - lr: 0.000040 - momentum: 0.000000
2023-10-14 23:15:33,541 epoch 1 - iter 1620/1809 - loss 0.27249387 - time (sec): 99.48 - samples/sec: 3427.40 - lr: 0.000045 - momentum: 0.000000
2023-10-14 23:15:44,480 epoch 1 - iter 1800/1809 - loss 0.25650520 - time (sec): 110.42 - samples/sec: 3426.34 - lr: 0.000050 - momentum: 0.000000
2023-10-14 23:15:44,991 ----------------------------------------------------------------------------------------------------
2023-10-14 23:15:44,991 EPOCH 1 done: loss 0.2558 - lr: 0.000050
2023-10-14 23:15:50,308 DEV : loss 0.10922493785619736 - f1-score (micro avg)  0.5898
2023-10-14 23:15:50,338 saving best model
2023-10-14 23:15:50,724 ----------------------------------------------------------------------------------------------------
2023-10-14 23:16:01,639 epoch 2 - iter 180/1809 - loss 0.09115416 - time (sec): 10.91 - samples/sec: 3394.64 - lr: 0.000049 - momentum: 0.000000
2023-10-14 23:16:12,680 epoch 2 - iter 360/1809 - loss
0.09214244 - time (sec): 21.95 - samples/sec: 3428.42 - lr: 0.000049 - momentum: 0.000000
2023-10-14 23:16:23,647 epoch 2 - iter 540/1809 - loss 0.08998102 - time (sec): 32.92 - samples/sec: 3449.64 - lr: 0.000048 - momentum: 0.000000
2023-10-14 23:16:34,929 epoch 2 - iter 720/1809 - loss 0.08845021 - time (sec): 44.20 - samples/sec: 3448.58 - lr: 0.000048 - momentum: 0.000000
2023-10-14 23:16:45,932 epoch 2 - iter 900/1809 - loss 0.08692869 - time (sec): 55.21 - samples/sec: 3466.90 - lr: 0.000047 - momentum: 0.000000
2023-10-14 23:16:57,081 epoch 2 - iter 1080/1809 - loss 0.08601996 - time (sec): 66.36 - samples/sec: 3452.03 - lr: 0.000047 - momentum: 0.000000
2023-10-14 23:17:08,054 epoch 2 - iter 1260/1809 - loss 0.08669678 - time (sec): 77.33 - samples/sec: 3443.25 - lr: 0.000046 - momentum: 0.000000
2023-10-14 23:17:18,723 epoch 2 - iter 1440/1809 - loss 0.08604451 - time (sec): 88.00 - samples/sec: 3433.05 - lr: 0.000046 - momentum: 0.000000
2023-10-14 23:17:29,971 epoch 2 - iter 1620/1809 - loss 0.08524993 - time (sec): 99.25 - samples/sec: 3434.57 - lr: 0.000045 - momentum: 0.000000
2023-10-14 23:17:40,783 epoch 2 - iter 1800/1809 - loss 0.08442574 - time (sec): 110.06 - samples/sec: 3435.34 - lr: 0.000044 - momentum: 0.000000
2023-10-14 23:17:41,312 ----------------------------------------------------------------------------------------------------
2023-10-14 23:17:41,313 EPOCH 2 done: loss 0.0843 - lr: 0.000044
2023-10-14 23:17:47,616 DEV : loss 0.11877861618995667 - f1-score (micro avg)  0.6492
2023-10-14 23:17:47,664 saving best model
2023-10-14 23:17:48,158 ----------------------------------------------------------------------------------------------------
2023-10-14 23:17:59,754 epoch 3 - iter 180/1809 - loss 0.05224098 - time (sec): 11.59 - samples/sec: 3334.61 - lr: 0.000044 - momentum: 0.000000
2023-10-14 23:18:10,717 epoch 3 - iter 360/1809 - loss 0.05757660 - time (sec): 22.56 - samples/sec: 3391.79 - lr: 0.000043 - momentum: 0.000000
2023-10-14
23:18:21,815 epoch 3 - iter 540/1809 - loss 0.05923171 - time (sec): 33.65 - samples/sec: 3374.86 - lr: 0.000043 - momentum: 0.000000
2023-10-14 23:18:32,890 epoch 3 - iter 720/1809 - loss 0.05951510 - time (sec): 44.73 - samples/sec: 3404.49 - lr: 0.000042 - momentum: 0.000000
2023-10-14 23:18:44,341 epoch 3 - iter 900/1809 - loss 0.05917685 - time (sec): 56.18 - samples/sec: 3376.67 - lr: 0.000042 - momentum: 0.000000
2023-10-14 23:18:55,811 epoch 3 - iter 1080/1809 - loss 0.05994787 - time (sec): 67.65 - samples/sec: 3358.55 - lr: 0.000041 - momentum: 0.000000
2023-10-14 23:19:07,534 epoch 3 - iter 1260/1809 - loss 0.05928422 - time (sec): 79.37 - samples/sec: 3339.69 - lr: 0.000041 - momentum: 0.000000
2023-10-14 23:19:19,370 epoch 3 - iter 1440/1809 - loss 0.05933673 - time (sec): 91.21 - samples/sec: 3312.36 - lr: 0.000040 - momentum: 0.000000
2023-10-14 23:19:30,948 epoch 3 - iter 1620/1809 - loss 0.06123094 - time (sec): 102.79 - samples/sec: 3312.93 - lr: 0.000039 - momentum: 0.000000
2023-10-14 23:19:41,857 epoch 3 - iter 1800/1809 - loss 0.05963446 - time (sec): 113.70 - samples/sec: 3325.96 - lr: 0.000039 - momentum: 0.000000
2023-10-14 23:19:42,476 ----------------------------------------------------------------------------------------------------
2023-10-14 23:19:42,476 EPOCH 3 done: loss 0.0595 - lr: 0.000039
2023-10-14 23:19:49,864 DEV : loss 0.1852269172668457 - f1-score (micro avg)  0.6066
2023-10-14 23:19:49,907 ----------------------------------------------------------------------------------------------------
2023-10-14 23:20:01,922 epoch 4 - iter 180/1809 - loss 0.03466777 - time (sec): 12.01 - samples/sec: 3240.89 - lr: 0.000038 - momentum: 0.000000
2023-10-14 23:20:12,957 epoch 4 - iter 360/1809 - loss 0.03740752 - time (sec): 23.05 - samples/sec: 3315.19 - lr: 0.000038 - momentum: 0.000000
2023-10-14 23:20:24,052 epoch 4 - iter 540/1809 - loss 0.04243438 - time (sec): 34.14 - samples/sec: 3350.72 - lr: 0.000037 - momentum: 0.000000
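The lr column in these entries follows the LinearScheduler plugin listed in the training params: linear warmup over the first 10% of all optimizer steps (warmup_fraction 0.1) up to the peak learning_rate of 5e-05, then linear decay to zero. A minimal sketch of that schedule, assuming ~1809 steps per epoch × 10 epochs as in this run (`linear_schedule_lr` is a hypothetical helper, not Flair's actual implementation):

```python
# Sketch of the linear warmup + linear decay schedule visible in
# the "lr:" column: ramp up over the first warmup_fraction of all
# steps to the peak learning rate, then decay linearly to zero.
def linear_schedule_lr(step, total_steps=1809 * 10, peak=5e-05, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)  # 1809 steps here
    if step < warmup_steps:
        return peak * step / warmup_steps          # warmup ramp
    return peak * (total_steps - step) / (total_steps - warmup_steps)  # decay


# End of epoch 1 (step ~1809): peak lr, matching the logged 0.000050.
print(round(linear_schedule_lr(1809), 6))  # 5e-05
# Epoch 3, iter 540 (global step 2*1809 + 540): matches the logged 0.000043.
print(round(linear_schedule_lr(2 * 1809 + 540), 6))  # 4.3e-05
```

Checked against two points in the log above; the decay endpoint (step 18090) returns exactly 0, matching the final epoch's lr of 0.000000.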
2023-10-14 23:20:35,171 epoch 4 - iter 720/1809 - loss 0.04361553 - time (sec): 45.26 - samples/sec: 3339.38 - lr: 0.000037 - momentum: 0.000000
2023-10-14 23:20:46,045 epoch 4 - iter 900/1809 - loss 0.04259376 - time (sec): 56.14 - samples/sec: 3357.05 - lr: 0.000036 - momentum: 0.000000
2023-10-14 23:20:57,326 epoch 4 - iter 1080/1809 - loss 0.04216192 - time (sec): 67.42 - samples/sec: 3361.99 - lr: 0.000036 - momentum: 0.000000
2023-10-14 23:21:08,492 epoch 4 - iter 1260/1809 - loss 0.04211285 - time (sec): 78.58 - samples/sec: 3368.00 - lr: 0.000035 - momentum: 0.000000
2023-10-14 23:21:19,539 epoch 4 - iter 1440/1809 - loss 0.04181257 - time (sec): 89.63 - samples/sec: 3382.30 - lr: 0.000034 - momentum: 0.000000
2023-10-14 23:21:30,326 epoch 4 - iter 1620/1809 - loss 0.04312238 - time (sec): 100.42 - samples/sec: 3393.06 - lr: 0.000034 - momentum: 0.000000
2023-10-14 23:21:41,155 epoch 4 - iter 1800/1809 - loss 0.04326423 - time (sec): 111.25 - samples/sec: 3398.47 - lr: 0.000033 - momentum: 0.000000
2023-10-14 23:21:41,677 ----------------------------------------------------------------------------------------------------
2023-10-14 23:21:41,677 EPOCH 4 done: loss 0.0433 - lr: 0.000033
2023-10-14 23:21:48,355 DEV : loss 0.21803739666938782 - f1-score (micro avg)  0.6213
2023-10-14 23:21:48,388 ----------------------------------------------------------------------------------------------------
2023-10-14 23:21:59,544 epoch 5 - iter 180/1809 - loss 0.03382032 - time (sec): 11.16 - samples/sec: 3385.98 - lr: 0.000033 - momentum: 0.000000
2023-10-14 23:22:10,312 epoch 5 - iter 360/1809 - loss 0.02988533 - time (sec): 21.92 - samples/sec: 3446.91 - lr: 0.000032 - momentum: 0.000000
2023-10-14 23:22:21,368 epoch 5 - iter 540/1809 - loss 0.02897812 - time (sec): 32.98 - samples/sec: 3436.13 - lr: 0.000032 - momentum: 0.000000
2023-10-14 23:22:32,201 epoch 5 - iter 720/1809 - loss 0.02920811 - time (sec): 43.81 - samples/sec: 3422.99 - lr: 0.000031 - momentum:
0.000000
2023-10-14 23:22:43,261 epoch 5 - iter 900/1809 - loss 0.03020770 - time (sec): 54.87 - samples/sec: 3422.06 - lr: 0.000031 - momentum: 0.000000
2023-10-14 23:22:54,098 epoch 5 - iter 1080/1809 - loss 0.03043284 - time (sec): 65.71 - samples/sec: 3422.19 - lr: 0.000030 - momentum: 0.000000
2023-10-14 23:23:04,857 epoch 5 - iter 1260/1809 - loss 0.03101516 - time (sec): 76.47 - samples/sec: 3423.04 - lr: 0.000029 - momentum: 0.000000
2023-10-14 23:23:16,285 epoch 5 - iter 1440/1809 - loss 0.03128116 - time (sec): 87.90 - samples/sec: 3425.18 - lr: 0.000029 - momentum: 0.000000
2023-10-14 23:23:27,369 epoch 5 - iter 1620/1809 - loss 0.03158724 - time (sec): 98.98 - samples/sec: 3432.98 - lr: 0.000028 - momentum: 0.000000
2023-10-14 23:23:38,672 epoch 5 - iter 1800/1809 - loss 0.03222026 - time (sec): 110.28 - samples/sec: 3430.35 - lr: 0.000028 - momentum: 0.000000
2023-10-14 23:23:39,199 ----------------------------------------------------------------------------------------------------
2023-10-14 23:23:39,199 EPOCH 5 done: loss 0.0324 - lr: 0.000028
2023-10-14 23:23:44,781 DEV : loss 0.29204583168029785 - f1-score (micro avg)  0.6381
2023-10-14 23:23:44,816 ----------------------------------------------------------------------------------------------------
2023-10-14 23:23:55,602 epoch 6 - iter 180/1809 - loss 0.01835359 - time (sec): 10.78 - samples/sec: 3331.85 - lr: 0.000027 - momentum: 0.000000
2023-10-14 23:24:06,716 epoch 6 - iter 360/1809 - loss 0.02104928 - time (sec): 21.90 - samples/sec: 3383.12 - lr: 0.000027 - momentum: 0.000000
2023-10-14 23:24:17,593 epoch 6 - iter 540/1809 - loss 0.02382736 - time (sec): 32.78 - samples/sec: 3387.15 - lr: 0.000026 - momentum: 0.000000
2023-10-14 23:24:29,242 epoch 6 - iter 720/1809 - loss 0.02520287 - time (sec): 44.43 - samples/sec: 3348.00 - lr: 0.000026 - momentum: 0.000000
2023-10-14 23:24:40,477 epoch 6 - iter 900/1809 - loss 0.02464033 - time (sec): 55.66 - samples/sec: 3367.84 - lr: 0.000025 -
momentum: 0.000000
2023-10-14 23:24:51,833 epoch 6 - iter 1080/1809 - loss 0.02490246 - time (sec): 67.02 - samples/sec: 3365.05 - lr: 0.000024 - momentum: 0.000000
2023-10-14 23:25:02,971 epoch 6 - iter 1260/1809 - loss 0.02386734 - time (sec): 78.15 - samples/sec: 3385.64 - lr: 0.000024 - momentum: 0.000000
2023-10-14 23:25:14,059 epoch 6 - iter 1440/1809 - loss 0.02384248 - time (sec): 89.24 - samples/sec: 3407.87 - lr: 0.000023 - momentum: 0.000000
2023-10-14 23:25:24,684 epoch 6 - iter 1620/1809 - loss 0.02370424 - time (sec): 99.87 - samples/sec: 3405.70 - lr: 0.000023 - momentum: 0.000000
2023-10-14 23:25:35,893 epoch 6 - iter 1800/1809 - loss 0.02323564 - time (sec): 111.08 - samples/sec: 3403.90 - lr: 0.000022 - momentum: 0.000000
2023-10-14 23:25:36,460 ----------------------------------------------------------------------------------------------------
2023-10-14 23:25:36,460 EPOCH 6 done: loss 0.0232 - lr: 0.000022
2023-10-14 23:25:41,960 DEV : loss 0.325937956571579 - f1-score (micro avg)  0.6292
2023-10-14 23:25:41,990 ----------------------------------------------------------------------------------------------------
2023-10-14 23:25:53,211 epoch 7 - iter 180/1809 - loss 0.01268428 - time (sec): 11.22 - samples/sec: 3309.54 - lr: 0.000022 - momentum: 0.000000
2023-10-14 23:26:04,242 epoch 7 - iter 360/1809 - loss 0.01253572 - time (sec): 22.25 - samples/sec: 3434.72 - lr: 0.000021 - momentum: 0.000000
2023-10-14 23:26:15,278 epoch 7 - iter 540/1809 - loss 0.01170739 - time (sec): 33.29 - samples/sec: 3474.17 - lr: 0.000021 - momentum: 0.000000
2023-10-14 23:26:26,064 epoch 7 - iter 720/1809 - loss 0.01321558 - time (sec): 44.07 - samples/sec: 3453.48 - lr: 0.000020 - momentum: 0.000000
2023-10-14 23:26:37,214 epoch 7 - iter 900/1809 - loss 0.01522502 - time (sec): 55.22 - samples/sec: 3442.44 - lr: 0.000019 - momentum: 0.000000
2023-10-14 23:26:48,017 epoch 7 - iter 1080/1809 - loss 0.01503871 - time (sec): 66.03 - samples/sec: 3451.90 - lr: 0.000019 -
momentum: 0.000000
2023-10-14 23:26:58,958 epoch 7 - iter 1260/1809 - loss 0.01509298 - time (sec): 76.97 - samples/sec: 3464.90 - lr: 0.000018 - momentum: 0.000000
2023-10-14 23:27:09,510 epoch 7 - iter 1440/1809 - loss 0.01494629 - time (sec): 87.52 - samples/sec: 3455.94 - lr: 0.000018 - momentum: 0.000000
2023-10-14 23:27:21,399 epoch 7 - iter 1620/1809 - loss 0.01502665 - time (sec): 99.41 - samples/sec: 3423.29 - lr: 0.000017 - momentum: 0.000000
2023-10-14 23:27:32,509 epoch 7 - iter 1800/1809 - loss 0.01515155 - time (sec): 110.52 - samples/sec: 3422.67 - lr: 0.000017 - momentum: 0.000000
2023-10-14 23:27:33,040 ----------------------------------------------------------------------------------------------------
2023-10-14 23:27:33,041 EPOCH 7 done: loss 0.0151 - lr: 0.000017
2023-10-14 23:27:38,689 DEV : loss 0.34486714005470276 - f1-score (micro avg)  0.6394
2023-10-14 23:27:38,721 ----------------------------------------------------------------------------------------------------
2023-10-14 23:27:49,645 epoch 8 - iter 180/1809 - loss 0.00504407 - time (sec): 10.92 - samples/sec: 3368.16 - lr: 0.000016 - momentum: 0.000000
2023-10-14 23:28:00,921 epoch 8 - iter 360/1809 - loss 0.00779998 - time (sec): 22.20 - samples/sec: 3393.34 - lr: 0.000016 - momentum: 0.000000
2023-10-14 23:28:12,199 epoch 8 - iter 540/1809 - loss 0.00806594 - time (sec): 33.48 - samples/sec: 3389.11 - lr: 0.000015 - momentum: 0.000000
2023-10-14 23:28:23,292 epoch 8 - iter 720/1809 - loss 0.00804521 - time (sec): 44.57 - samples/sec: 3387.81 - lr: 0.000014 - momentum: 0.000000
2023-10-14 23:28:34,228 epoch 8 - iter 900/1809 - loss 0.00787755 - time (sec): 55.51 - samples/sec: 3387.75 - lr: 0.000014 - momentum: 0.000000
2023-10-14 23:28:45,339 epoch 8 - iter 1080/1809 - loss 0.00837508 - time (sec): 66.62 - samples/sec: 3398.85 - lr: 0.000013 - momentum: 0.000000
2023-10-14 23:28:56,214 epoch 8 - iter 1260/1809 - loss 0.00920763 - time (sec): 77.49 - samples/sec: 3418.01 - lr: 0.000013
- momentum: 0.000000
2023-10-14 23:29:07,281 epoch 8 - iter 1440/1809 - loss 0.01023911 - time (sec): 88.56 - samples/sec: 3413.01 - lr: 0.000012 - momentum: 0.000000
2023-10-14 23:29:18,361 epoch 8 - iter 1620/1809 - loss 0.01031827 - time (sec): 99.64 - samples/sec: 3406.77 - lr: 0.000012 - momentum: 0.000000
2023-10-14 23:29:29,759 epoch 8 - iter 1800/1809 - loss 0.01052622 - time (sec): 111.04 - samples/sec: 3408.30 - lr: 0.000011 - momentum: 0.000000
2023-10-14 23:29:30,243 ----------------------------------------------------------------------------------------------------
2023-10-14 23:29:30,243 EPOCH 8 done: loss 0.0105 - lr: 0.000011
2023-10-14 23:29:38,316 DEV : loss 0.37847810983657837 - f1-score (micro avg)  0.6439
2023-10-14 23:29:38,350 ----------------------------------------------------------------------------------------------------
2023-10-14 23:29:49,652 epoch 9 - iter 180/1809 - loss 0.00514140 - time (sec): 11.30 - samples/sec: 3379.90 - lr: 0.000011 - momentum: 0.000000
2023-10-14 23:30:00,849 epoch 9 - iter 360/1809 - loss 0.00528666 - time (sec): 22.50 - samples/sec: 3363.85 - lr: 0.000010 - momentum: 0.000000
2023-10-14 23:30:11,917 epoch 9 - iter 540/1809 - loss 0.00706337 - time (sec): 33.57 - samples/sec: 3376.04 - lr: 0.000009 - momentum: 0.000000
2023-10-14 23:30:23,087 epoch 9 - iter 720/1809 - loss 0.00660810 - time (sec): 44.74 - samples/sec: 3397.13 - lr: 0.000009 - momentum: 0.000000
2023-10-14 23:30:34,171 epoch 9 - iter 900/1809 - loss 0.00744791 - time (sec): 55.82 - samples/sec: 3402.85 - lr: 0.000008 - momentum: 0.000000
2023-10-14 23:30:45,401 epoch 9 - iter 1080/1809 - loss 0.00728486 - time (sec): 67.05 - samples/sec: 3414.58 - lr: 0.000008 - momentum: 0.000000
2023-10-14 23:30:56,449 epoch 9 - iter 1260/1809 - loss 0.00750422 - time (sec): 78.10 - samples/sec: 3417.96 - lr: 0.000007 - momentum: 0.000000
2023-10-14 23:31:07,262 epoch 9 - iter 1440/1809 - loss 0.00739246 - time (sec): 88.91 - samples/sec: 3414.94 - lr:
0.000007 - momentum: 0.000000
2023-10-14 23:31:18,285 epoch 9 - iter 1620/1809 - loss 0.00714565 - time (sec): 99.93 - samples/sec: 3420.75 - lr: 0.000006 - momentum: 0.000000
2023-10-14 23:31:28,970 epoch 9 - iter 1800/1809 - loss 0.00700843 - time (sec): 110.62 - samples/sec: 3418.39 - lr: 0.000006 - momentum: 0.000000
2023-10-14 23:31:29,513 ----------------------------------------------------------------------------------------------------
2023-10-14 23:31:29,513 EPOCH 9 done: loss 0.0070 - lr: 0.000006
2023-10-14 23:31:35,977 DEV : loss 0.39124542474746704 - f1-score (micro avg)  0.6438
2023-10-14 23:31:36,014 ----------------------------------------------------------------------------------------------------
2023-10-14 23:31:47,439 epoch 10 - iter 180/1809 - loss 0.00408599 - time (sec): 11.42 - samples/sec: 3336.10 - lr: 0.000005 - momentum: 0.000000
2023-10-14 23:31:58,371 epoch 10 - iter 360/1809 - loss 0.00331461 - time (sec): 22.36 - samples/sec: 3400.13 - lr: 0.000004 - momentum: 0.000000
2023-10-14 23:32:09,317 epoch 10 - iter 540/1809 - loss 0.00293145 - time (sec): 33.30 - samples/sec: 3393.56 - lr: 0.000004 - momentum: 0.000000
2023-10-14 23:32:20,482 epoch 10 - iter 720/1809 - loss 0.00380863 - time (sec): 44.47 - samples/sec: 3405.89 - lr: 0.000003 - momentum: 0.000000
2023-10-14 23:32:31,512 epoch 10 - iter 900/1809 - loss 0.00352863 - time (sec): 55.50 - samples/sec: 3420.09 - lr: 0.000003 - momentum: 0.000000
2023-10-14 23:32:42,367 epoch 10 - iter 1080/1809 - loss 0.00358662 - time (sec): 66.35 - samples/sec: 3425.15 - lr: 0.000002 - momentum: 0.000000
2023-10-14 23:32:53,206 epoch 10 - iter 1260/1809 - loss 0.00382641 - time (sec): 77.19 - samples/sec: 3431.10 - lr: 0.000002 - momentum: 0.000000
2023-10-14 23:33:04,379 epoch 10 - iter 1440/1809 - loss 0.00369667 - time (sec): 88.36 - samples/sec: 3435.24 - lr: 0.000001 - momentum: 0.000000
2023-10-14 23:33:15,225 epoch 10 - iter 1620/1809 - loss 0.00426803 - time (sec): 99.21 - samples/sec:
3416.99 - lr: 0.000001 - momentum: 0.000000
2023-10-14 23:33:26,612 epoch 10 - iter 1800/1809 - loss 0.00418167 - time (sec): 110.60 - samples/sec: 3421.02 - lr: 0.000000 - momentum: 0.000000
2023-10-14 23:33:27,097 ----------------------------------------------------------------------------------------------------
2023-10-14 23:33:27,097 EPOCH 10 done: loss 0.0042 - lr: 0.000000
2023-10-14 23:33:34,503 DEV : loss 0.40973159670829773 - f1-score (micro avg)  0.643
2023-10-14 23:33:34,924 ----------------------------------------------------------------------------------------------------
2023-10-14 23:33:34,926 Loading model from best epoch ...
2023-10-14 23:33:36,531 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-14 23:33:44,358
Results:
- F-score (micro) 0.6361
- F-score (macro) 0.4367
- Accuracy 0.4774

By class:
              precision    recall  f1-score   support

         loc     0.6206    0.7750    0.6892       591
        pers     0.5596    0.6975    0.6209       357
         org     0.0000    0.0000    0.0000        79

   micro avg     0.5911    0.6884    0.6361      1027
   macro avg     0.3934    0.4908    0.4367      1027
weighted avg     0.5516    0.6884    0.6125      1027

2023-10-14 23:33:44,359 ----------------------------------------------------------------------------------------------------
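The micro and macro averages in the final table follow from the per-class rows. Macro F1 is the unweighted mean of the per-class F1 scores (which is why the zero-scoring org class pulls it down to 0.4367), while micro F1 is the harmonic mean of the pooled precision and recall. A minimal sketch reproducing both from the rounded values in the table above:

```python
# Sketch: reproducing the micro / macro F1 averages of the final
# evaluation from the (rounded) per-class values in the table.
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# class -> (precision, recall) as reported in the table
per_class = {
    "loc":  (0.6206, 0.7750),
    "pers": (0.5596, 0.6975),
    "org":  (0.0000, 0.0000),
}

# macro F1: unweighted mean over classes (org's 0.0 drags it down)
macro_f1 = sum(f1(p, r) for p, r in per_class.values()) / len(per_class)

# micro F1: F1 of the pooled (micro avg) precision and recall
micro_f1 = f1(0.5911, 0.6884)

print(round(macro_f1, 4), round(micro_f1, 4))  # 0.4367 0.6361
```

Both values match the table to four decimal places; the log itself computes them from raw TP/FP/FN counts rather than from rounded precision/recall, so small last-digit differences in the per-class F1 column are possible.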