2023-10-17 10:28:15,149 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:15,150 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:28:15,150 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:15,150 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 10:28:15,150
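The lr column in the per-iteration lines below follows the LinearScheduler plugin listed in the training parameters: with warmup_fraction 0.1 over 10 epochs of 242 mini-batches (2420 steps total), the rate ramps linearly from 0 to the peak 5e-05 during epoch 1, then decays linearly back to 0. A minimal sketch of that schedule (the function name and signature are ours for illustration, not Flair's API):

```python
def linear_schedule_lr(step, peak_lr=5e-05, total_steps=2420, warmup_fraction=0.1):
    """Linear warmup followed by linear decay, matching the lr column in this log."""
    warmup_steps = int(total_steps * warmup_fraction)  # 242 steps = all of epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # ramp up from 0 to peak_lr
    # decay linearly from peak_lr back to 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

For example, step 24 (epoch 1, iter 24/242) gives roughly 0.000005 and step 266 (epoch 2, iter 24/242) roughly 0.000049, matching the logged values.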
----------------------------------------------------------------------------------------------------
2023-10-17 10:28:15,150 Train: 966 sentences
2023-10-17 10:28:15,150 (train_with_dev=False, train_with_test=False)
2023-10-17 10:28:15,150 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:15,150 Training Params:
2023-10-17 10:28:15,150 - learning_rate: "5e-05"
2023-10-17 10:28:15,150 - mini_batch_size: "4"
2023-10-17 10:28:15,150 - max_epochs: "10"
2023-10-17 10:28:15,150 - shuffle: "True"
2023-10-17 10:28:15,150 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:15,150 Plugins:
2023-10-17 10:28:15,150 - TensorboardLogger
2023-10-17 10:28:15,150 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:28:15,150 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:15,150 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:28:15,150 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:28:15,150 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:15,151 Computation:
2023-10-17 10:28:15,151 - compute on device: cuda:0
2023-10-17 10:28:15,151 - embedding storage: none
2023-10-17 10:28:15,151 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:15,151 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 10:28:15,151 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:15,151 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:15,151 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:28:16,279 epoch 1 - iter 24/242 - loss 3.25970479 - time (sec): 1.13 - samples/sec: 2063.85 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:28:17,447 epoch 1 - iter 48/242 - loss 2.50860117 - time (sec): 2.30 - samples/sec: 1947.03 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:28:18,562 epoch 1 - iter 72/242 - loss 1.92429534 - time (sec): 3.41 - samples/sec: 2006.21 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:28:19,648 epoch 1 - iter 96/242 - loss 1.52557452 - time (sec): 4.50 - samples/sec: 2089.80 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:28:20,720 epoch 1 - iter 120/242 - loss 1.27333302 - time (sec): 5.57 - samples/sec: 2157.73 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:28:21,807 epoch 1 - iter 144/242 - loss 1.11924984 - time (sec): 6.66 - samples/sec: 2166.86 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:28:22,892 epoch 1 - iter 168/242 - loss 0.99530680 - time (sec): 7.74 - samples/sec: 2181.30 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:28:24,020 epoch 1 - iter 192/242 - loss 0.89272964 - time (sec): 8.87 - samples/sec: 2197.17 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:28:25,127 epoch 1 - iter 216/242 - loss 0.82638206 - time (sec): 9.98 - samples/sec: 2200.20 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:28:26,251 epoch 1 - iter 240/242 - loss 0.75901859 - time (sec): 11.10 - samples/sec: 2214.62 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:28:26,356 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:26,356 EPOCH 1 done: loss 0.7554 - lr: 0.000049
2023-10-17 10:28:27,017 DEV : loss 0.21835866570472717 - f1-score (micro avg)  0.6432
2023-10-17 10:28:27,023 saving best model
2023-10-17 10:28:27,425 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:28,566 epoch 2 - iter 24/242 - loss 0.18598405 - time (sec): 1.14 - samples/sec: 2161.39 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:28:29,653 epoch 2 - iter 48/242 - loss 0.19121188 - time (sec): 2.23 - samples/sec: 2045.28 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:28:30,734 epoch 2 - iter 72/242 - loss 0.19609755 - time (sec): 3.31 - samples/sec: 2099.14 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:28:31,825 epoch 2 - iter 96/242 - loss 0.18728732 - time (sec): 4.40 - samples/sec: 2210.09 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:28:32,932 epoch 2 - iter 120/242 - loss 0.18546693 - time (sec): 5.51 - samples/sec: 2226.80 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:28:34,012 epoch 2 - iter 144/242 - loss 0.18286089 - time (sec): 6.59 - samples/sec: 2249.58 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:28:35,103 epoch 2 - iter 168/242 - loss 0.17929003 - time (sec): 7.68 - samples/sec: 2234.40 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:28:36,240 epoch 2 - iter 192/242 - loss 0.31890305 - time (sec): 8.81 - samples/sec: 2238.04 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:28:37,348 epoch 2 - iter 216/242 - loss 0.33430272 - time (sec): 9.92 - samples/sec: 2240.19 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:28:38,429 epoch 2 - iter 240/242 - loss 0.32417209 - time (sec): 11.00 - samples/sec: 2232.49 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:28:38,518 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:38,518 EPOCH 2 done: loss 0.3220 - lr: 0.000045
2023-10-17 10:28:39,559 DEV : loss 0.15037748217582703 - f1-score (micro avg)  0.7771
2023-10-17 10:28:39,565 saving best model
2023-10-17 10:28:40,090 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:41,163 epoch 3 - iter 24/242 - loss 0.15274512 - time (sec): 1.07 - samples/sec: 2337.86 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:28:42,240 epoch 3 - iter 48/242 - loss 0.12344920 - time (sec): 2.15 - samples/sec: 2267.01 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:28:43,344 epoch 3 - iter 72/242 - loss 0.11944212 - time (sec): 3.25 - samples/sec: 2329.47 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:28:44,398 epoch 3 - iter 96/242 - loss 0.12216499 - time (sec): 4.30 - samples/sec: 2334.27 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:28:45,458 epoch 3 - iter 120/242 - loss 0.12282298 - time (sec): 5.36 - samples/sec: 2309.92 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:28:46,528 epoch 3 - iter 144/242 - loss 0.12138986 - time (sec): 6.43 - samples/sec: 2357.94 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:28:47,618 epoch 3 - iter 168/242 - loss 0.13616462 - time (sec): 7.52 - samples/sec: 2296.42 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:28:48,708 epoch 3 - iter 192/242 - loss 0.13661181 - time (sec): 8.61 - samples/sec: 2286.65 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:28:49,871 epoch 3 - iter 216/242 - loss 0.13181716 - time (sec): 9.78 - samples/sec: 2292.64 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:28:51,006 epoch 3 - iter 240/242 - loss 0.13090406 - time (sec): 10.91 - samples/sec: 2261.09 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:28:51,098 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:51,098 EPOCH 3 done: loss 0.1313 - lr: 0.000039
2023-10-17 10:28:51,891 DEV : loss 0.2135360836982727 - f1-score (micro avg)  0.77
2023-10-17 10:28:51,898 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:53,066 epoch 4 - iter 24/242 - loss 0.09863727 - time (sec): 1.17 - samples/sec: 2166.11 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:28:54,238 epoch 4 - iter 48/242 - loss 0.10087220 - time (sec): 2.34 - samples/sec: 2100.80 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:28:55,416 epoch 4 - iter 72/242 - loss 0.11188105 - time (sec): 3.52 - samples/sec: 2113.22 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:28:56,544 epoch 4 - iter 96/242 - loss 0.11486984 - time (sec): 4.64 - samples/sec: 2088.90 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:28:57,665 epoch 4 - iter 120/242 - loss 0.10845387 - time (sec): 5.77 - samples/sec: 2156.20 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:28:58,834 epoch 4 - iter 144/242 - loss 0.09860065 - time (sec): 6.93 - samples/sec: 2151.57 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:28:59,970 epoch 4 - iter 168/242 - loss 0.09788769 - time (sec): 8.07 - samples/sec: 2159.35 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:29:01,069 epoch 4 - iter 192/242 - loss 0.10604095 - time (sec): 9.17 - samples/sec: 2162.97 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:29:02,262 epoch 4 - iter 216/242 - loss 0.10330150 - time (sec): 10.36 - samples/sec: 2136.98 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:29:03,475 epoch 4 - iter 240/242 - loss 0.09964278 - time (sec): 11.58 - samples/sec: 2121.12 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:29:03,569 ----------------------------------------------------------------------------------------------------
2023-10-17 10:29:03,570 EPOCH 4 done: loss 0.0989 - lr: 0.000033
2023-10-17 10:29:04,335 DEV : loss 0.18611398339271545 - f1-score (micro avg)  0.8166
2023-10-17 10:29:04,341 saving best model
2023-10-17 10:29:04,868 ----------------------------------------------------------------------------------------------------
2023-10-17 10:29:06,031 epoch 5 - iter 24/242 - loss 0.08934265 - time (sec): 1.16 - samples/sec: 2414.22 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:29:07,119 epoch 5 - iter 48/242 - loss 0.08049125 - time (sec): 2.24 - samples/sec: 2206.73 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:29:08,297 epoch 5 - iter 72/242 - loss 0.08373737 - time (sec): 3.42 - samples/sec: 2198.09 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:29:09,438 epoch 5 - iter 96/242 - loss 0.07656306 - time (sec): 4.56 - samples/sec: 2181.26 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:29:10,540 epoch 5 - iter 120/242 - loss 0.07462864 - time (sec): 5.67 - samples/sec: 2180.41 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:29:11,664 epoch 5 - iter 144/242 - loss 0.07536521 - time (sec): 6.79 - samples/sec: 2181.94 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:29:12,839 epoch 5 - iter 168/242 - loss 0.07596334 - time (sec): 7.96 - samples/sec: 2193.93 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:29:14,011 epoch 5 - iter 192/242 - loss 0.07702994 - time (sec): 9.14 - samples/sec: 2152.67 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:29:15,147 epoch 5 - iter 216/242 - loss 0.07728869 - time (sec): 10.27 - samples/sec: 2150.90 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:29:16,268 epoch 5 - iter 240/242 - loss 0.08078917 - time (sec): 11.39 - samples/sec: 2155.23 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:29:16,364 ----------------------------------------------------------------------------------------------------
2023-10-17 10:29:16,364 EPOCH 5 done: loss 0.0806 - lr: 0.000028
2023-10-17 10:29:17,145 DEV : loss 0.18981605768203735 - f1-score (micro avg)  0.831
2023-10-17 10:29:17,150 saving best model
2023-10-17 10:29:17,669 ----------------------------------------------------------------------------------------------------
2023-10-17 10:29:18,760 epoch 6 - iter 24/242 - loss 0.02943104 - time (sec): 1.09 - samples/sec: 2443.20 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:29:19,872 epoch 6 - iter 48/242 - loss 0.04261874 - time (sec): 2.20 - samples/sec: 2335.53 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:29:20,971 epoch 6 - iter 72/242 - loss 0.04340975 - time (sec): 3.30 - samples/sec: 2329.31 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:29:22,066 epoch 6 - iter 96/242 - loss 0.04101219 - time (sec): 4.39 - samples/sec: 2260.52 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:29:23,214 epoch 6 - iter 120/242 - loss 0.03815515 - time (sec): 5.54 - samples/sec: 2239.10 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:29:24,319 epoch 6 - iter 144/242 - loss 0.04478318 - time (sec): 6.64 - samples/sec: 2236.35 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:29:25,418 epoch 6 - iter 168/242 - loss 0.04544838 - time (sec): 7.74 - samples/sec: 2228.93 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:29:26,560 epoch 6 - iter 192/242 - loss 0.04835983 - time (sec): 8.89 - samples/sec: 2225.48 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:29:27,678 epoch 6 - iter 216/242 - loss 0.05363040 - time (sec): 10.00 - samples/sec: 2214.38 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:29:28,846 epoch 6 - iter 240/242 - loss 0.05668504 - time (sec): 11.17 - samples/sec: 2203.54 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:29:28,945 ----------------------------------------------------------------------------------------------------
2023-10-17 10:29:28,945 EPOCH 6 done: loss 0.0575 - lr: 0.000022
2023-10-17 10:29:29,761 DEV : loss 0.1939019411802292 - f1-score (micro avg)  0.8304
2023-10-17 10:29:29,766 ----------------------------------------------------------------------------------------------------
2023-10-17 10:29:30,955 epoch 7 - iter 24/242 - loss 0.05934802 - time (sec): 1.19 - samples/sec: 2221.48 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:29:32,138 epoch 7 - iter 48/242 - loss 0.05294271 - time (sec): 2.37 - samples/sec: 2129.09 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:29:33,319 epoch 7 - iter 72/242 - loss 0.04909491 - time (sec): 3.55 - samples/sec: 2118.84 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:29:34,519 epoch 7 - iter 96/242 - loss 0.05141294 - time (sec): 4.75 - samples/sec: 2093.45 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:29:35,727 epoch 7 - iter 120/242 - loss 0.04623379 - time (sec): 5.96 - samples/sec: 2124.66 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:29:36,845 epoch 7 - iter 144/242 - loss 0.04483115 - time (sec): 7.08 - samples/sec: 2135.90 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:29:37,972 epoch 7 - iter 168/242 - loss 0.04401257 - time (sec): 8.20 - samples/sec: 2141.72 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:29:39,065 epoch 7 - iter 192/242 - loss 0.04200145 - time (sec): 9.30 - samples/sec: 2116.98 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:29:40,164 epoch 7 - iter 216/242 - loss 0.04473825 - time (sec): 10.40 - samples/sec: 2132.91 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:29:41,272 epoch 7 - iter 240/242 - loss 0.04496126 - time (sec): 11.50 - samples/sec: 2133.84 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:29:41,364 ----------------------------------------------------------------------------------------------------
2023-10-17 10:29:41,364 EPOCH 7 done: loss 0.0447 - lr: 0.000017
2023-10-17 10:29:42,125 DEV : loss 0.22301127016544342 - f1-score (micro avg)  0.8398
2023-10-17 10:29:42,130 saving best model
2023-10-17 10:29:42,688 ----------------------------------------------------------------------------------------------------
2023-10-17 10:29:43,809 epoch 8 - iter 24/242 - loss 0.04066135 - time (sec): 1.12 - samples/sec: 2112.77 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:29:44,900 epoch 8 - iter 48/242 - loss 0.03299735 - time (sec): 2.21 - samples/sec: 2184.80 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:29:46,068 epoch 8 - iter 72/242 - loss 0.03419853 - time (sec): 3.38 - samples/sec: 2193.05 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:29:47,235 epoch 8 - iter 96/242 - loss 0.03274503 - time (sec): 4.54 - samples/sec: 2206.72 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:29:48,330 epoch 8 - iter 120/242 - loss 0.03355638 - time (sec): 5.64 - samples/sec: 2225.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:29:49,433 epoch 8 - iter 144/242 - loss 0.03113597 - time (sec): 6.74 - samples/sec: 2242.72 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:29:50,536 epoch 8 - iter 168/242 - loss 0.02998637 - time (sec): 7.85 - samples/sec: 2230.13 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:29:51,663 epoch 8 - iter 192/242 - loss 0.03244036 - time (sec): 8.97 - samples/sec: 2185.12 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:29:52,811 epoch 8 - iter 216/242 - loss 0.03079493 - time (sec): 10.12 - samples/sec: 2219.45 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:29:53,900 epoch 8 - iter 240/242 - loss 0.03018910 - time (sec): 11.21 - samples/sec: 2194.34 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:29:53,986 ----------------------------------------------------------------------------------------------------
2023-10-17 10:29:53,987 EPOCH 8 done: loss 0.0300 - lr: 0.000011
2023-10-17 10:29:54,762 DEV : loss 0.23514655232429504 - f1-score (micro avg)  0.8275
2023-10-17 10:29:54,768 ----------------------------------------------------------------------------------------------------
2023-10-17 10:29:55,888 epoch 9 - iter 24/242 - loss 0.02392462 - time (sec): 1.12 - samples/sec: 2082.32 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:29:56,997 epoch 9 - iter 48/242 - loss 0.01877040 - time (sec): 2.23 - samples/sec: 2044.62 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:29:58,107 epoch 9 - iter 72/242 - loss 0.02213220 - time (sec): 3.34 - samples/sec: 2134.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:29:59,251 epoch 9 - iter 96/242 - loss 0.02009617 - time (sec): 4.48 - samples/sec: 2173.26 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:30:00,355 epoch 9 - iter 120/242 - loss 0.02487369 - time (sec): 5.59 - samples/sec: 2158.96 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:30:01,483 epoch 9 - iter 144/242 - loss 0.02431532 - time (sec): 6.71 - samples/sec: 2169.69 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:30:02,568 epoch 9 - iter 168/242 - loss 0.02585720 - time (sec): 7.80 - samples/sec: 2165.52 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:30:03,645 epoch 9 - iter 192/242 - loss 0.02605376 - time (sec): 8.88 - samples/sec: 2196.00 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:30:04,745 epoch 9 - iter 216/242 - loss 0.02578923 - time (sec): 9.98 - samples/sec: 2210.45 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:30:05,855 epoch 9 - iter 240/242 - loss 0.02598808 - time (sec): 11.09 - samples/sec: 2214.24 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:30:05,945 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:05,945 EPOCH 9 done: loss 0.0260 - lr: 0.000006
2023-10-17 10:30:06,757 DEV : loss 0.2330138087272644 - f1-score (micro avg)  0.8331
2023-10-17 10:30:06,763 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:07,852 epoch 10 - iter 24/242 - loss 0.00804413 - time (sec): 1.09 - samples/sec: 2175.04 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:30:08,960 epoch 10 - iter 48/242 - loss 0.01919974 - time (sec): 2.20 - samples/sec: 2236.83 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:30:10,064 epoch 10 - iter 72/242 - loss 0.01965478 - time (sec): 3.30 - samples/sec: 2226.16 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:30:11,194 epoch 10 - iter 96/242 - loss 0.01576570 - time (sec): 4.43 - samples/sec: 2236.06 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:30:12,388 epoch 10 - iter 120/242 - loss 0.01606595 - time (sec): 5.62 - samples/sec: 2216.72 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:30:13,514 epoch 10 - iter 144/242 - loss 0.01895275 - time (sec): 6.75 - samples/sec: 2208.36 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:30:14,621 epoch 10 - iter 168/242 - loss 0.02026875 - time (sec): 7.86 - samples/sec: 2209.21 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:30:15,756 epoch 10 - iter 192/242 - loss 0.01984338 - time (sec): 8.99 - samples/sec: 2225.21 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:30:16,841 epoch 10 - iter 216/242 - loss 0.02098018 - time (sec): 10.08 - samples/sec: 2221.52 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:30:17,937 epoch 10 - iter 240/242 - loss 0.02035977 - time (sec): 11.17 - samples/sec: 2206.88 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:30:18,037 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:18,037 EPOCH 10 done: loss 0.0203 - lr: 0.000000
2023-10-17 10:30:18,833 DEV : loss 0.23592014610767365 - f1-score (micro avg)  0.8342
2023-10-17 10:30:19,267 ----------------------------------------------------------------------------------------------------
2023-10-17 10:30:19,268 Loading model from best epoch ...
2023-10-17 10:30:20,717 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 10:30:21,583 Results:
- F-score (micro) 0.8076
- F-score (macro) 0.5753
- Accuracy 0.6948

By class:
              precision    recall  f1-score   support

        pers     0.8345    0.8705    0.8521       139
       scope     0.8222    0.8605    0.8409       129
        work     0.6854    0.7625    0.7219        80
         loc     0.7500    0.3333    0.4615         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7936    0.8222    0.8076       360
   macro avg     0.6184    0.5654    0.5753       360
weighted avg     0.7879    0.8222    0.8023       360
2023-10-17 10:30:21,583 ----------------------------------------------------------------------------------------------------
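The aggregate scores in the final table follow from the per-class rows: micro averaging pools predictions over all 360 support entities (so micro F1 is the harmonic mean of the pooled precision and recall), macro averaging is the unweighted mean of the five per-class F1 scores, and the weighted average weights each class F1 by its support. A quick sanity check, using the numbers from the table above:

```python
# (precision, recall, f1, support) rows from the final evaluation table
rows = {
    "pers":  (0.8345, 0.8705, 0.8521, 139),
    "scope": (0.8222, 0.8605, 0.8409, 129),
    "work":  (0.6854, 0.7625, 0.7219,  80),
    "loc":   (0.7500, 0.3333, 0.4615,   9),
    "date":  (0.0000, 0.0000, 0.0000,   3),
}

# macro F1: unweighted mean of the per-class F1 scores
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)

# micro F1: harmonic mean of the pooled (micro) precision and recall
micro_p, micro_r = 0.7936, 0.8222
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# weighted F1: per-class F1 weighted by support
total_support = sum(s for _, _, _, s in rows.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in rows.values()) / total_support
```

Rounded to four decimals these give 0.5753, 0.8076, and 0.8023, matching the macro, micro, and weighted rows; the gap between micro (0.8076) and macro (0.5753) F1 is driven by the tiny loc and date classes.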