2023-10-17 10:39:35,326 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:35,327 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:39:35,327 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:35,327 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 10:39:35,327 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:35,327 Train: 966 sentences
2023-10-17 10:39:35,327 (train_with_dev=False, train_with_test=False)
2023-10-17 10:39:35,327 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:35,327 Training Params:
2023-10-17 10:39:35,327 - learning_rate: "3e-05"
2023-10-17 10:39:35,327 - mini_batch_size: "8"
2023-10-17 10:39:35,327 - max_epochs: "10"
2023-10-17 10:39:35,327 - shuffle: "True"
2023-10-17 10:39:35,327 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:35,327 Plugins:
2023-10-17 10:39:35,327 - TensorboardLogger
2023-10-17 10:39:35,327 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:39:35,327 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:35,327 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:39:35,327 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:39:35,328 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:35,328 Computation:
2023-10-17 10:39:35,328 - compute on device: cuda:0
2023-10-17 10:39:35,328 - embedding storage: none
2023-10-17 10:39:35,328 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:35,328 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 10:39:35,328 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:35,328 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:35,328 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:39:36,109 epoch 1 - iter 12/121 - loss 4.25722385 - time (sec): 0.78 - samples/sec: 2872.89 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:39:36,809 epoch 1 - iter 24/121 - loss 4.00564937 - time (sec): 1.48 - samples/sec: 3154.85 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:39:37,642 epoch 1 - iter 36/121 - loss 3.40093382 - time (sec): 2.31 - samples/sec: 3172.82 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:39:38,404 epoch 1 - iter 48/121 - loss 2.78062785 - time (sec): 3.07 - samples/sec: 3143.94 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:39:39,161 epoch 1 - iter 60/121 - loss 2.35231184 - time (sec): 3.83 - samples/sec: 3199.31 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:39:39,973 epoch 1 - iter 72/121 - loss 2.06127860 - time (sec): 4.64 - samples/sec: 3147.72 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:39:40,735 epoch 1 - iter 84/121 - loss 1.81693509 - time (sec): 5.41 - samples/sec: 3200.90 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:39:41,470 epoch 1 - iter 96/121 - loss 1.67323399 - time (sec): 6.14 - samples/sec: 3170.67 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:39:42,303 epoch 1 - iter 108/121 - loss 1.51544682 - time (sec): 6.97 - samples/sec: 3172.12 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:39:43,069 epoch 1 - iter 120/121 - loss 1.39049965 - time (sec): 7.74 - samples/sec: 3180.24 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:39:43,123 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:43,124 EPOCH 1 done: loss 1.3838 - lr: 0.000030
2023-10-17 10:39:43,777 DEV : loss 0.251099556684494 - f1-score (micro avg) 0.5079
2023-10-17 10:39:43,787 saving best model
2023-10-17 10:39:44,242 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:44,933 epoch 2 - iter 12/121 - loss 0.26699978 - time (sec): 0.69 - samples/sec: 3362.95 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:39:45,686 epoch 2 - iter 24/121 - loss 0.27485034 - time (sec): 1.44 - samples/sec: 3098.99 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:39:46,440 epoch 2 - iter 36/121 - loss 0.27317472 - time (sec): 2.20 - samples/sec: 3168.93 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:39:47,147 epoch 2 - iter 48/121 - loss 0.25723374 - time (sec): 2.90 - samples/sec: 3131.61 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:39:47,919 epoch 2 - iter 60/121 - loss 0.24909349 - time (sec): 3.68 - samples/sec: 3140.85 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:39:48,756 epoch 2 - iter 72/121 - loss 0.23902434 - time (sec): 4.51 - samples/sec: 3138.67 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:39:49,474 epoch 2 - iter 84/121 - loss 0.23187052 - time (sec): 5.23 - samples/sec: 3187.32 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:39:50,215 epoch 2 - iter 96/121 - loss 0.22252914 - time (sec): 5.97 - samples/sec: 3228.69 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:39:51,016 epoch 2 - iter 108/121 - loss 0.21712543 - time (sec): 6.77 - samples/sec: 3218.14 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:39:51,746 epoch 2 - iter 120/121 - loss 0.21132328 - time (sec): 7.50 - samples/sec: 3262.76 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:39:51,826 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:51,826 EPOCH 2 done: loss 0.2094 - lr: 0.000027
2023-10-17 10:39:52,611 DEV : loss 0.14902380108833313 - f1-score (micro avg) 0.75
2023-10-17 10:39:52,617 saving best model
2023-10-17 10:39:53,122 ----------------------------------------------------------------------------------------------------
2023-10-17 10:39:53,897 epoch 3 - iter 12/121 - loss 0.12685344 - time (sec): 0.77 - samples/sec: 3347.70 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:39:54,635 epoch 3 - iter 24/121 - loss 0.13397141 - time (sec): 1.51 - samples/sec: 3458.76 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:39:55,449 epoch 3 - iter 36/121 - loss 0.13558235 - time (sec): 2.33 - samples/sec: 3364.66 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:39:56,155 epoch 3 - iter 48/121 - loss 0.13474115 - time (sec): 3.03 - samples/sec: 3288.55 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:39:56,871 epoch 3 - iter 60/121 - loss 0.13146380 - time (sec): 3.75 - samples/sec: 3298.28 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:39:57,695 epoch 3 - iter 72/121 - loss 0.13411500 - time (sec): 4.57 - samples/sec: 3238.48 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:39:58,415 epoch 3 - iter 84/121 - loss 0.13560617 - time (sec): 5.29 - samples/sec: 3284.45 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:39:59,143 epoch 3 - iter 96/121 - loss 0.13216345 - time (sec): 6.02 - samples/sec: 3237.82 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:39:59,872 epoch 3 - iter 108/121 - loss 0.12725540 - time (sec): 6.75 - samples/sec: 3254.99 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:40:00,620 epoch 3 - iter 120/121 - loss 0.12351647 - time (sec): 7.50 - samples/sec: 3282.36 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:40:00,666 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:00,666 EPOCH 3 done: loss 0.1229 - lr: 0.000023
2023-10-17 10:40:01,626 DEV : loss 0.14644193649291992 - f1-score (micro avg) 0.7974
2023-10-17 10:40:01,632 saving best model
2023-10-17 10:40:02,148 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:02,872 epoch 4 - iter 12/121 - loss 0.11855343 - time (sec): 0.72 - samples/sec: 3318.24 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:40:03,569 epoch 4 - iter 24/121 - loss 0.11086880 - time (sec): 1.42 - samples/sec: 3358.02 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:40:04,292 epoch 4 - iter 36/121 - loss 0.09714372 - time (sec): 2.14 - samples/sec: 3198.21 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:40:05,039 epoch 4 - iter 48/121 - loss 0.09071078 - time (sec): 2.89 - samples/sec: 3157.80 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:40:05,821 epoch 4 - iter 60/121 - loss 0.09150231 - time (sec): 3.67 - samples/sec: 3188.65 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:40:06,595 epoch 4 - iter 72/121 - loss 0.08403862 - time (sec): 4.44 - samples/sec: 3289.27 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:40:07,362 epoch 4 - iter 84/121 - loss 0.08179926 - time (sec): 5.21 - samples/sec: 3286.29 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:40:08,181 epoch 4 - iter 96/121 - loss 0.08496227 - time (sec): 6.03 - samples/sec: 3289.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:40:08,894 epoch 4 - iter 108/121 - loss 0.08355173 - time (sec): 6.74 - samples/sec: 3304.38 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:40:09,660 epoch 4 - iter 120/121 - loss 0.08565829 - time (sec): 7.51 - samples/sec: 3276.12 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:40:09,706 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:09,707 EPOCH 4 done: loss 0.0857 - lr: 0.000020
2023-10-17 10:40:10,514 DEV : loss 0.14408640563488007 - f1-score (micro avg) 0.8069
2023-10-17 10:40:10,521 saving best model
2023-10-17 10:40:11,050 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:11,843 epoch 5 - iter 12/121 - loss 0.06469600 - time (sec): 0.79 - samples/sec: 3149.16 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:40:12,700 epoch 5 - iter 24/121 - loss 0.06320875 - time (sec): 1.65 - samples/sec: 3129.05 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:40:13,483 epoch 5 - iter 36/121 - loss 0.05527648 - time (sec): 2.43 - samples/sec: 3190.52 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:40:14,248 epoch 5 - iter 48/121 - loss 0.05466742 - time (sec): 3.20 - samples/sec: 3259.54 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:40:15,013 epoch 5 - iter 60/121 - loss 0.05439576 - time (sec): 3.96 - samples/sec: 3233.41 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:40:15,756 epoch 5 - iter 72/121 - loss 0.05872498 - time (sec): 4.70 - samples/sec: 3267.51 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:40:16,427 epoch 5 - iter 84/121 - loss 0.05848788 - time (sec): 5.37 - samples/sec: 3280.41 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:40:17,153 epoch 5 - iter 96/121 - loss 0.05846039 - time (sec): 6.10 - samples/sec: 3272.26 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:40:17,891 epoch 5 - iter 108/121 - loss 0.05853485 - time (sec): 6.84 - samples/sec: 3269.18 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:40:18,638 epoch 5 - iter 120/121 - loss 0.05724472 - time (sec): 7.59 - samples/sec: 3250.89 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:40:18,684 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:18,685 EPOCH 5 done: loss 0.0571 - lr: 0.000017
2023-10-17 10:40:19,474 DEV : loss 0.15564198791980743 - f1-score (micro avg) 0.8015
2023-10-17 10:40:19,480 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:20,222 epoch 6 - iter 12/121 - loss 0.04969618 - time (sec): 0.74 - samples/sec: 3225.61 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:40:20,960 epoch 6 - iter 24/121 - loss 0.04644119 - time (sec): 1.48 - samples/sec: 3078.69 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:40:21,753 epoch 6 - iter 36/121 - loss 0.04209743 - time (sec): 2.27 - samples/sec: 3148.81 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:40:22,501 epoch 6 - iter 48/121 - loss 0.04711594 - time (sec): 3.02 - samples/sec: 3220.15 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:40:23,295 epoch 6 - iter 60/121 - loss 0.04764666 - time (sec): 3.81 - samples/sec: 3223.55 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:40:24,073 epoch 6 - iter 72/121 - loss 0.04477288 - time (sec): 4.59 - samples/sec: 3214.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:40:24,807 epoch 6 - iter 84/121 - loss 0.04169976 - time (sec): 5.33 - samples/sec: 3205.75 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:40:25,540 epoch 6 - iter 96/121 - loss 0.04232871 - time (sec): 6.06 - samples/sec: 3233.25 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:40:26,304 epoch 6 - iter 108/121 - loss 0.04135200 - time (sec): 6.82 - samples/sec: 3234.30 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:40:27,015 epoch 6 - iter 120/121 - loss 0.04338782 - time (sec): 7.53 - samples/sec: 3270.66 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:40:27,061 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:27,061 EPOCH 6 done: loss 0.0434 - lr: 0.000013
2023-10-17 10:40:27,828 DEV : loss 0.158291295170784 - f1-score (micro avg) 0.8204
2023-10-17 10:40:27,833 saving best model
2023-10-17 10:40:28,330 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:29,054 epoch 7 - iter 12/121 - loss 0.04068845 - time (sec): 0.72 - samples/sec: 3223.97 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:40:29,841 epoch 7 - iter 24/121 - loss 0.02967958 - time (sec): 1.51 - samples/sec: 3316.62 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:40:30,547 epoch 7 - iter 36/121 - loss 0.02930156 - time (sec): 2.21 - samples/sec: 3237.98 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:40:31,282 epoch 7 - iter 48/121 - loss 0.03417830 - time (sec): 2.95 - samples/sec: 3281.96 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:40:32,050 epoch 7 - iter 60/121 - loss 0.03460234 - time (sec): 3.72 - samples/sec: 3308.07 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:40:32,784 epoch 7 - iter 72/121 - loss 0.03483963 - time (sec): 4.45 - samples/sec: 3336.06 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:40:33,557 epoch 7 - iter 84/121 - loss 0.03363358 - time (sec): 5.22 - samples/sec: 3322.19 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:40:34,261 epoch 7 - iter 96/121 - loss 0.03264425 - time (sec): 5.93 - samples/sec: 3317.81 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:40:35,055 epoch 7 - iter 108/121 - loss 0.03194370 - time (sec): 6.72 - samples/sec: 3313.03 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:40:35,820 epoch 7 - iter 120/121 - loss 0.03311731 - time (sec): 7.49 - samples/sec: 3286.01 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:40:35,875 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:35,876 EPOCH 7 done: loss 0.0330 - lr: 0.000010
2023-10-17 10:40:36,669 DEV : loss 0.1770782619714737 - f1-score (micro avg) 0.8185
2023-10-17 10:40:36,681 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:37,429 epoch 8 - iter 12/121 - loss 0.02318625 - time (sec): 0.75 - samples/sec: 3466.20 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:40:38,254 epoch 8 - iter 24/121 - loss 0.02179879 - time (sec): 1.57 - samples/sec: 3154.01 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:40:38,978 epoch 8 - iter 36/121 - loss 0.01865314 - time (sec): 2.30 - samples/sec: 3244.27 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:40:39,799 epoch 8 - iter 48/121 - loss 0.02183121 - time (sec): 3.12 - samples/sec: 3222.49 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:40:40,533 epoch 8 - iter 60/121 - loss 0.02294907 - time (sec): 3.85 - samples/sec: 3239.56 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:40:41,312 epoch 8 - iter 72/121 - loss 0.02283248 - time (sec): 4.63 - samples/sec: 3244.31 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:40:42,039 epoch 8 - iter 84/121 - loss 0.02507082 - time (sec): 5.36 - samples/sec: 3259.64 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:40:42,799 epoch 8 - iter 96/121 - loss 0.02396940 - time (sec): 6.12 - samples/sec: 3244.75 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:40:43,575 epoch 8 - iter 108/121 - loss 0.02353728 - time (sec): 6.89 - samples/sec: 3224.57 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:40:44,343 epoch 8 - iter 120/121 - loss 0.02497422 - time (sec): 7.66 - samples/sec: 3211.10 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:40:44,397 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:44,398 EPOCH 8 done: loss 0.0249 - lr: 0.000007
2023-10-17 10:40:45,177 DEV : loss 0.19383026659488678 - f1-score (micro avg) 0.8306
2023-10-17 10:40:45,183 saving best model
2023-10-17 10:40:45,701 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:46,480 epoch 9 - iter 12/121 - loss 0.02257339 - time (sec): 0.77 - samples/sec: 3397.27 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:40:47,200 epoch 9 - iter 24/121 - loss 0.02788200 - time (sec): 1.48 - samples/sec: 3334.14 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:40:47,941 epoch 9 - iter 36/121 - loss 0.02369329 - time (sec): 2.23 - samples/sec: 3205.46 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:40:48,712 epoch 9 - iter 48/121 - loss 0.02520442 - time (sec): 3.00 - samples/sec: 3205.41 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:40:49,476 epoch 9 - iter 60/121 - loss 0.02227888 - time (sec): 3.76 - samples/sec: 3217.52 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:40:50,296 epoch 9 - iter 72/121 - loss 0.02140296 - time (sec): 4.58 - samples/sec: 3213.21 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:40:50,996 epoch 9 - iter 84/121 - loss 0.01988658 - time (sec): 5.28 - samples/sec: 3224.77 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:40:51,791 epoch 9 - iter 96/121 - loss 0.02156552 - time (sec): 6.08 - samples/sec: 3214.58 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:40:52,531 epoch 9 - iter 108/121 - loss 0.02013316 - time (sec): 6.82 - samples/sec: 3232.85 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:40:53,309 epoch 9 - iter 120/121 - loss 0.01851873 - time (sec): 7.59 - samples/sec: 3240.54 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:40:53,365 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:53,366 EPOCH 9 done: loss 0.0186 - lr: 0.000004
2023-10-17 10:40:54,229 DEV : loss 0.1929878145456314 - f1-score (micro avg) 0.8363
2023-10-17 10:40:54,235 saving best model
2023-10-17 10:40:54,758 ----------------------------------------------------------------------------------------------------
2023-10-17 10:40:55,485 epoch 10 - iter 12/121 - loss 0.03533969 - time (sec): 0.72 - samples/sec: 3361.11 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:40:56,257 epoch 10 - iter 24/121 - loss 0.03073965 - time (sec): 1.49 - samples/sec: 3256.62 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:40:57,027 epoch 10 - iter 36/121 - loss 0.02357388 - time (sec): 2.26 - samples/sec: 3159.50 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:40:57,804 epoch 10 - iter 48/121 - loss 0.02346600 - time (sec): 3.04 - samples/sec: 3225.02 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:40:58,572 epoch 10 - iter 60/121 - loss 0.02331925 - time (sec): 3.81 - samples/sec: 3251.15 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:40:59,365 epoch 10 - iter 72/121 - loss 0.02129171 - time (sec): 4.60 - samples/sec: 3191.76 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:41:00,107 epoch 10 - iter 84/121 - loss 0.02077126 - time (sec): 5.34 - samples/sec: 3216.04 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:41:00,884 epoch 10 - iter 96/121 - loss 0.01902530 - time (sec): 6.12 - samples/sec: 3199.96 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:41:01,678 epoch 10 - iter 108/121 - loss 0.01742089 - time (sec): 6.92 - samples/sec: 3187.95 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:41:02,493 epoch 10 - iter 120/121 - loss 0.01642575 - time (sec): 7.73 - samples/sec: 3181.68 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:41:02,541 ----------------------------------------------------------------------------------------------------
2023-10-17 10:41:02,541 EPOCH 10 done: loss 0.0164 - lr: 0.000000
2023-10-17 10:41:03,340 DEV : loss 0.19614964723587036 - f1-score (micro avg) 0.8416
2023-10-17 10:41:03,345 saving best model
2023-10-17 10:41:04,256 ----------------------------------------------------------------------------------------------------
2023-10-17 10:41:04,257 Loading model from best epoch ...
2023-10-17 10:41:05,657 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 10:41:06,551 Results:
- F-score (micro) 0.8263
- F-score (macro) 0.563
- Accuracy 0.7225

By class:
              precision    recall  f1-score   support

        pers     0.8531    0.8777    0.8652       139
       scope     0.8444    0.8837    0.8636       129
        work     0.7111    0.8000    0.7529        80
         loc     0.6667    0.2222    0.3333         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.8140    0.8389    0.8263       360
   macro avg     0.6151    0.5567    0.5630       360
weighted avg     0.8067    0.8389    0.8192       360

2023-10-17 10:41:06,552 ----------------------------------------------------------------------------------------------------
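The aggregate rows of the final report can be cross-checked against the per-class rows. The sketch below is not part of the training run: it copies the numbers from the "By class" table by hand and recomputes the macro, weighted, and micro F1 scores. The variable and function names are illustrative, and it assumes the macro average is taken over the five classes listed in the table.

```python
from statistics import mean

# Per-class (precision, recall, f1, support), copied by hand from the
# "By class" table in the log above.
per_class = {
    "pers":  (0.8531, 0.8777, 0.8652, 139),
    "scope": (0.8444, 0.8837, 0.8636, 129),
    "work":  (0.7111, 0.8000, 0.7529, 80),
    "loc":   (0.6667, 0.2222, 0.3333, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = mean(row[2] for row in per_class.values())

# Weighted F1: per-class F1 scores weighted by class support.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro F1: harmonic mean of the logged micro-averaged precision and recall.
micro_f1 = f1(0.8140, 0.8389)

print(f"support={total_support} macro={macro_f1:.4f} "
      f"weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

The gap between micro F1 (0.8263) and macro F1 (0.563) comes from the two low-support classes: `loc` (9 spans) and `date` (3 spans) barely affect the micro average but each count for one fifth of the macro average.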