2023-10-17 09:46:43,097 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:43,098 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 09:46:43,098 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:43,098 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-17 09:46:43,099 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:43,099 Train: 1214 sentences
2023-10-17 09:46:43,099 (train_with_dev=False, train_with_test=False)
2023-10-17 09:46:43,099 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:43,099 Training Params:
2023-10-17 09:46:43,099 - learning_rate: "3e-05"
2023-10-17 09:46:43,099 - mini_batch_size: "8"
2023-10-17 09:46:43,099 - max_epochs: "10"
2023-10-17 09:46:43,099 - shuffle: "True"
2023-10-17 09:46:43,099 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:43,099 Plugins:
2023-10-17 09:46:43,099 - TensorboardLogger
2023-10-17 09:46:43,099 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 09:46:43,099 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:43,099 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 09:46:43,099 - metric: "('micro avg', 'f1-score')"
2023-10-17 09:46:43,099 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:43,100 Computation:
2023-10-17 09:46:43,100 - compute on device: cuda:0
2023-10-17 09:46:43,100 - embedding storage: none
2023-10-17 09:46:43,100 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:43,100 Model training base path: "hmbench-ajmc/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 09:46:43,100 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:43,100 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:43,100 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 09:46:43,959 epoch 1 - iter 15/152 - loss 4.08709480 - time (sec): 0.86 - samples/sec: 3453.45 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:46:44,872 epoch 1 - iter 30/152 - loss 3.67349628 - time (sec): 1.77 - samples/sec: 3469.52 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:46:45,776 epoch 1 - iter 45/152 - loss 2.98257228 - time (sec): 2.67 - samples/sec: 3487.86 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:46:46,625 epoch 1 - iter 60/152 - loss 2.43261488 - time (sec): 3.52 - samples/sec: 3470.07 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:46:47,505 epoch 1 - iter 75/152 - loss 2.07662538 - time (sec): 4.40 - samples/sec: 3528.10 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:46:48,321 epoch 1 - iter 90/152 - loss 1.85236036 - time (sec): 5.22 - samples/sec: 3504.06 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:46:49,206 epoch 1 - iter 105/152 - loss 1.66941765 - time (sec): 6.10 - samples/sec: 3488.69 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:46:50,046 epoch 1 - iter 120/152 - loss 1.51562782 - time (sec): 6.94 - samples/sec: 3505.92 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:46:50,908 epoch 1 - iter 135/152 - loss 1.39014384 - time (sec): 7.81 - samples/sec: 3523.69 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:46:51,799 epoch 1 - iter 150/152 - loss 1.27593710 - time (sec): 8.70 - samples/sec: 3530.79 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:46:51,913 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:51,913 EPOCH 1 done: loss 1.2662 - lr: 0.000029
2023-10-17 09:46:52,828 DEV : loss 0.23923853039741516 - f1-score (micro avg) 0.5092
2023-10-17 09:46:52,835 saving best model
2023-10-17 09:46:53,170 ----------------------------------------------------------------------------------------------------
2023-10-17 09:46:54,038 epoch 2 - iter 15/152 - loss 0.25968914 - time (sec): 0.87 - samples/sec: 3534.88 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:46:54,955 epoch 2 - iter 30/152 - loss 0.22279190 - time (sec): 1.78 - samples/sec: 3502.97 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:46:55,883 epoch 2 - iter 45/152 - loss 0.21763067 - time (sec): 2.71 - samples/sec: 3365.72 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:46:56,778 epoch 2 - iter 60/152 - loss 0.21582908 - time (sec): 3.61 - samples/sec: 3366.66 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:46:57,651 epoch 2 - iter 75/152 - loss 0.20207098 - time (sec): 4.48 - samples/sec: 3414.93 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:46:58,492 epoch 2 - iter 90/152 - loss 0.19077238 - time (sec): 5.32 - samples/sec: 3434.00 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:46:59,339 epoch 2 - iter 105/152 - loss 0.17992756 - time (sec): 6.17 - samples/sec: 3449.70 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:47:00,207 epoch 2 - iter 120/152 - loss 0.17275352 - time (sec): 7.04 - samples/sec: 3467.53 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:47:01,059 epoch 2 - iter 135/152 - loss 0.17265650 - time (sec): 7.89 - samples/sec: 3503.04 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:47:01,892 epoch 2 - iter 150/152 - loss 0.17032603 - time (sec): 8.72 - samples/sec: 3520.36 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:47:01,988 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:01,988 EPOCH 2 done: loss 0.1692 - lr: 0.000027
2023-10-17 09:47:02,937 DEV : loss 0.12957707047462463 - f1-score (micro avg) 0.7696
2023-10-17 09:47:02,944 saving best model
2023-10-17 09:47:03,404 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:04,236 epoch 3 - iter 15/152 - loss 0.11222872 - time (sec): 0.83 - samples/sec: 3805.52 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:47:05,077 epoch 3 - iter 30/152 - loss 0.10389302 - time (sec): 1.67 - samples/sec: 3593.03 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:47:05,894 epoch 3 - iter 45/152 - loss 0.10041671 - time (sec): 2.49 - samples/sec: 3604.24 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:47:06,752 epoch 3 - iter 60/152 - loss 0.09597802 - time (sec): 3.35 - samples/sec: 3565.52 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:47:07,632 epoch 3 - iter 75/152 - loss 0.10191719 - time (sec): 4.23 - samples/sec: 3539.25 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:47:08,493 epoch 3 - iter 90/152 - loss 0.10150171 - time (sec): 5.09 - samples/sec: 3546.28 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:47:09,349 epoch 3 - iter 105/152 - loss 0.09625097 - time (sec): 5.94 - samples/sec: 3565.65 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:47:10,243 epoch 3 - iter 120/152 - loss 0.09520497 - time (sec): 6.84 - samples/sec: 3600.78 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:47:11,064 epoch 3 - iter 135/152 - loss 0.09136958 - time (sec): 7.66 - samples/sec: 3623.79 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:47:11,924 epoch 3 - iter 150/152 - loss 0.09165308 - time (sec): 8.52 - samples/sec: 3604.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:47:12,031 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:12,031 EPOCH 3 done: loss 0.0928 - lr: 0.000023
2023-10-17 09:47:12,981 DEV : loss 0.1297578513622284 - f1-score (micro avg) 0.8223
2023-10-17 09:47:12,988 saving best model
2023-10-17 09:47:13,500 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:14,297 epoch 4 - iter 15/152 - loss 0.04986111 - time (sec): 0.79 - samples/sec: 3513.29 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:47:15,108 epoch 4 - iter 30/152 - loss 0.06192583 - time (sec): 1.61 - samples/sec: 3453.16 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:47:15,949 epoch 4 - iter 45/152 - loss 0.06310085 - time (sec): 2.45 - samples/sec: 3442.15 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:47:16,807 epoch 4 - iter 60/152 - loss 0.05987327 - time (sec): 3.31 - samples/sec: 3434.64 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:47:17,655 epoch 4 - iter 75/152 - loss 0.06234632 - time (sec): 4.15 - samples/sec: 3493.64 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:47:18,530 epoch 4 - iter 90/152 - loss 0.06373864 - time (sec): 5.03 - samples/sec: 3538.26 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:47:19,363 epoch 4 - iter 105/152 - loss 0.06595611 - time (sec): 5.86 - samples/sec: 3564.53 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:47:20,231 epoch 4 - iter 120/152 - loss 0.06558915 - time (sec): 6.73 - samples/sec: 3574.10 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:47:21,136 epoch 4 - iter 135/152 - loss 0.06663958 - time (sec): 7.63 - samples/sec: 3575.29 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:47:22,058 epoch 4 - iter 150/152 - loss 0.06251869 - time (sec): 8.56 - samples/sec: 3576.16 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:47:22,163 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:22,164 EPOCH 4 done: loss 0.0631 - lr: 0.000020
2023-10-17 09:47:23,133 DEV : loss 0.1361662745475769 - f1-score (micro avg) 0.8489
2023-10-17 09:47:23,140 saving best model
2023-10-17 09:47:23,624 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:24,515 epoch 5 - iter 15/152 - loss 0.02755188 - time (sec): 0.89 - samples/sec: 3563.62 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:47:25,340 epoch 5 - iter 30/152 - loss 0.05347785 - time (sec): 1.71 - samples/sec: 3458.92 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:47:26,187 epoch 5 - iter 45/152 - loss 0.05330478 - time (sec): 2.56 - samples/sec: 3507.33 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:47:27,011 epoch 5 - iter 60/152 - loss 0.05452557 - time (sec): 3.39 - samples/sec: 3502.84 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:47:27,829 epoch 5 - iter 75/152 - loss 0.05130820 - time (sec): 4.20 - samples/sec: 3498.76 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:47:28,697 epoch 5 - iter 90/152 - loss 0.05010256 - time (sec): 5.07 - samples/sec: 3590.48 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:47:29,623 epoch 5 - iter 105/152 - loss 0.04542425 - time (sec): 6.00 - samples/sec: 3590.37 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:47:30,536 epoch 5 - iter 120/152 - loss 0.04294284 - time (sec): 6.91 - samples/sec: 3549.85 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:47:31,399 epoch 5 - iter 135/152 - loss 0.04426878 - time (sec): 7.77 - samples/sec: 3532.31 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:47:32,276 epoch 5 - iter 150/152 - loss 0.04471386 - time (sec): 8.65 - samples/sec: 3538.31 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:47:32,391 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:32,391 EPOCH 5 done: loss 0.0453 - lr: 0.000017
2023-10-17 09:47:33,354 DEV : loss 0.1634262502193451 - f1-score (micro avg) 0.8416
2023-10-17 09:47:33,361 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:34,309 epoch 6 - iter 15/152 - loss 0.02365755 - time (sec): 0.95 - samples/sec: 3485.90 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:47:35,243 epoch 6 - iter 30/152 - loss 0.03444613 - time (sec): 1.88 - samples/sec: 3533.94 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:47:36,111 epoch 6 - iter 45/152 - loss 0.03248390 - time (sec): 2.75 - samples/sec: 3539.09 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:47:36,966 epoch 6 - iter 60/152 - loss 0.02712965 - time (sec): 3.60 - samples/sec: 3551.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:47:37,834 epoch 6 - iter 75/152 - loss 0.02826301 - time (sec): 4.47 - samples/sec: 3539.72 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:47:38,671 epoch 6 - iter 90/152 - loss 0.03537780 - time (sec): 5.31 - samples/sec: 3493.33 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:47:39,488 epoch 6 - iter 105/152 - loss 0.03787211 - time (sec): 6.13 - samples/sec: 3487.84 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:47:40,346 epoch 6 - iter 120/152 - loss 0.03866626 - time (sec): 6.98 - samples/sec: 3490.52 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:47:41,241 epoch 6 - iter 135/152 - loss 0.03845673 - time (sec): 7.88 - samples/sec: 3480.34 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:47:42,099 epoch 6 - iter 150/152 - loss 0.03701677 - time (sec): 8.74 - samples/sec: 3507.30 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:47:42,196 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:42,196 EPOCH 6 done: loss 0.0367 - lr: 0.000013
2023-10-17 09:47:43,228 DEV : loss 0.1642225831747055 - f1-score (micro avg) 0.8565
2023-10-17 09:47:43,235 saving best model
2023-10-17 09:47:43,723 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:44,589 epoch 7 - iter 15/152 - loss 0.04256990 - time (sec): 0.86 - samples/sec: 3311.66 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:47:45,408 epoch 7 - iter 30/152 - loss 0.03497680 - time (sec): 1.68 - samples/sec: 3455.63 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:47:46,308 epoch 7 - iter 45/152 - loss 0.03729344 - time (sec): 2.58 - samples/sec: 3502.72 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:47:47,172 epoch 7 - iter 60/152 - loss 0.03052896 - time (sec): 3.45 - samples/sec: 3484.81 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:47:48,031 epoch 7 - iter 75/152 - loss 0.02753474 - time (sec): 4.31 - samples/sec: 3500.61 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:47:48,900 epoch 7 - iter 90/152 - loss 0.02898334 - time (sec): 5.18 - samples/sec: 3449.95 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:47:49,794 epoch 7 - iter 105/152 - loss 0.02789337 - time (sec): 6.07 - samples/sec: 3447.00 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:47:50,693 epoch 7 - iter 120/152 - loss 0.02796994 - time (sec): 6.97 - samples/sec: 3455.20 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:47:51,613 epoch 7 - iter 135/152 - loss 0.02876490 - time (sec): 7.89 - samples/sec: 3463.24 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:47:52,484 epoch 7 - iter 150/152 - loss 0.02832453 - time (sec): 8.76 - samples/sec: 3499.54 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:47:52,593 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:52,593 EPOCH 7 done: loss 0.0280 - lr: 0.000010
2023-10-17 09:47:53,732 DEV : loss 0.18156136572360992 - f1-score (micro avg) 0.8605
2023-10-17 09:47:53,739 saving best model
2023-10-17 09:47:54,201 ----------------------------------------------------------------------------------------------------
2023-10-17 09:47:55,053 epoch 8 - iter 15/152 - loss 0.06044905 - time (sec): 0.85 - samples/sec: 3469.12 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:47:55,916 epoch 8 - iter 30/152 - loss 0.03014943 - time (sec): 1.71 - samples/sec: 3474.85 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:47:56,804 epoch 8 - iter 45/152 - loss 0.02352535 - time (sec): 2.60 - samples/sec: 3544.30 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:47:57,630 epoch 8 - iter 60/152 - loss 0.02298262 - time (sec): 3.43 - samples/sec: 3537.91 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:47:58,467 epoch 8 - iter 75/152 - loss 0.02611197 - time (sec): 4.26 - samples/sec: 3542.55 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:47:59,376 epoch 8 - iter 90/152 - loss 0.02299230 - time (sec): 5.17 - samples/sec: 3547.10 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:48:00,198 epoch 8 - iter 105/152 - loss 0.02172982 - time (sec): 6.00 - samples/sec: 3544.12 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:48:01,086 epoch 8 - iter 120/152 - loss 0.02326433 - time (sec): 6.88 - samples/sec: 3575.70 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:48:01,916 epoch 8 - iter 135/152 - loss 0.02337614 - time (sec): 7.71 - samples/sec: 3581.30 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:48:02,790 epoch 8 - iter 150/152 - loss 0.02314189 - time (sec): 8.59 - samples/sec: 3568.10 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:48:02,890 ----------------------------------------------------------------------------------------------------
2023-10-17 09:48:02,890 EPOCH 8 done: loss 0.0236 - lr: 0.000007
2023-10-17 09:48:03,876 DEV : loss 0.1801178753376007 - f1-score (micro avg) 0.8501
2023-10-17 09:48:03,883 ----------------------------------------------------------------------------------------------------
2023-10-17 09:48:04,750 epoch 9 - iter 15/152 - loss 0.01430670 - time (sec): 0.87 - samples/sec: 3286.70 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:48:05,628 epoch 9 - iter 30/152 - loss 0.01715728 - time (sec): 1.74 - samples/sec: 3406.95 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:48:06,509 epoch 9 - iter 45/152 - loss 0.02458273 - time (sec): 2.62 - samples/sec: 3420.35 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:48:07,393 epoch 9 - iter 60/152 - loss 0.02244915 - time (sec): 3.51 - samples/sec: 3491.25 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:48:08,282 epoch 9 - iter 75/152 - loss 0.01954133 - time (sec): 4.40 - samples/sec: 3471.12 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:48:09,152 epoch 9 - iter 90/152 - loss 0.02153946 - time (sec): 5.27 - samples/sec: 3459.86 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:48:10,000 epoch 9 - iter 105/152 - loss 0.01973313 - time (sec): 6.12 - samples/sec: 3442.79 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:48:10,883 epoch 9 - iter 120/152 - loss 0.01775200 - time (sec): 7.00 - samples/sec: 3425.38 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:48:11,818 epoch 9 - iter 135/152 - loss 0.01978596 - time (sec): 7.93 - samples/sec: 3458.04 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:48:12,655 epoch 9 - iter 150/152 - loss 0.01852494 - time (sec): 8.77 - samples/sec: 3478.16 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:48:12,774 ----------------------------------------------------------------------------------------------------
2023-10-17 09:48:12,774 EPOCH 9 done: loss 0.0190 - lr: 0.000004
2023-10-17 09:48:13,727 DEV : loss 0.18825717270374298 - f1-score (micro avg) 0.8698
2023-10-17 09:48:13,733 saving best model
2023-10-17 09:48:14,211 ----------------------------------------------------------------------------------------------------
2023-10-17 09:48:15,017 epoch 10 - iter 15/152 - loss 0.01308198 - time (sec): 0.80 - samples/sec: 3750.13 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:48:15,889 epoch 10 - iter 30/152 - loss 0.00736048 - time (sec): 1.68 - samples/sec: 3675.34 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:48:16,733 epoch 10 - iter 45/152 - loss 0.01711075 - time (sec): 2.52 - samples/sec: 3714.84 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:48:17,592 epoch 10 - iter 60/152 - loss 0.02056631 - time (sec): 3.38 - samples/sec: 3611.98 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:48:18,466 epoch 10 - iter 75/152 - loss 0.01835657 - time (sec): 4.25 - samples/sec: 3611.82 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:48:19,342 epoch 10 - iter 90/152 - loss 0.01965530 - time (sec): 5.13 - samples/sec: 3555.02 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:48:20,242 epoch 10 - iter 105/152 - loss 0.01854851 - time (sec): 6.03 - samples/sec: 3536.19 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:48:21,094 epoch 10 - iter 120/152 - loss 0.01829723 - time (sec): 6.88 - samples/sec: 3564.57 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:48:21,946 epoch 10 - iter 135/152 - loss 0.01684338 - time (sec): 7.73 - samples/sec: 3534.19 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:48:22,843 epoch 10 - iter 150/152 - loss 0.01616166 - time (sec): 8.63 - samples/sec: 3544.29 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:48:22,961 ----------------------------------------------------------------------------------------------------
2023-10-17 09:48:22,961 EPOCH 10 done: loss 0.0160 - lr: 0.000000
2023-10-17 09:48:23,944 DEV : loss 0.18469107151031494 - f1-score (micro avg) 0.8629
2023-10-17 09:48:24,317 ----------------------------------------------------------------------------------------------------
2023-10-17 09:48:24,320 Loading model from best epoch ...
2023-10-17 09:48:25,886 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-17 09:48:26,924
Results:
- F-score (micro) 0.8292
- F-score (macro) 0.642
- Accuracy 0.7133
By class:
              precision    recall  f1-score   support

       scope     0.7730    0.8344    0.8025       151
        work     0.7685    0.8737    0.8177        95
        pers     0.9091    0.9375    0.9231        96
        date     0.0000    0.0000    0.0000         3
         loc     0.6667    0.6667    0.6667         3

   micro avg     0.7963    0.8649    0.8292       348
   macro avg     0.6235    0.6625    0.6420       348
weighted avg     0.8017    0.8649    0.8319       348
2023-10-17 09:48:26,924 ----------------------------------------------------------------------------------------------------
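
The aggregate scores in the results block can be cross-checked against the per-class rows. The sketch below is not part of the training run. It copies the numbers from the log by hand and assumes the usual definitions: macro F1 is the unweighted mean of the per-class F1 values, and micro F1 is the harmonic mean of the micro-averaged precision and recall. The names `by_class` and `f1` are illustrative.

```python
# Per-class rows copied by hand from the "By class" table above:
# label -> (precision, recall, f1, support)
by_class = {
    "scope": (0.7730, 0.8344, 0.8025, 151),
    "work": (0.7685, 0.8737, 0.8177, 95),
    "pers": (0.9091, 0.9375, 0.9231, 96),
    "date": (0.0000, 0.0000, 0.0000, 3),
    "loc": (0.6667, 0.6667, 0.6667, 3),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Macro F1: unweighted mean over classes, so the 3-example "date" class
# (F1 = 0) counts as much as the 151-example "scope" class.
macro_f1 = sum(row[2] for row in by_class.values()) / len(by_class)

# Micro F1: computed from the micro-averaged precision and recall in the log.
micro_f1 = f1(0.7963, 0.8649)

# Total support should match the 348 entities reported in the avg rows.
total_support = sum(row[3] for row in by_class.values())

print(f"macro F1 ~ {macro_f1:.4f}")
print(f"micro F1 ~ {micro_f1:.4f}")
print(f"support   = {total_support}")
```

Under these assumptions the recomputed values agree with the reported 0.6420 (macro) and 0.8292 (micro) up to rounding. The gap between the two comes largely from the `date` class, which has only 3 test entities and an F1 of 0.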