2023-10-17 08:25:45,131 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:45,132 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:25:45,132 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:45,132 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:25:45,132 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:45,132 Train: 1100 sentences
2023-10-17 08:25:45,133 (train_with_dev=False, train_with_test=False)
2023-10-17 08:25:45,133 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:45,133 Training Params:
2023-10-17 08:25:45,133 - learning_rate: "5e-05"
2023-10-17 08:25:45,133 - mini_batch_size: "8"
2023-10-17 08:25:45,133 - max_epochs: "10"
2023-10-17 08:25:45,133 - shuffle: "True"
2023-10-17 08:25:45,133 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:45,133 Plugins:
2023-10-17 08:25:45,133 - TensorboardLogger
2023-10-17 08:25:45,133 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:25:45,133 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:45,133 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:25:45,133 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:25:45,133 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:45,133 Computation:
2023-10-17 08:25:45,133 - compute on device: cuda:0
2023-10-17 08:25:45,133 - embedding storage: none
2023-10-17 08:25:45,133 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:45,133 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 08:25:45,133 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:45,133 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:45,133 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 08:25:45,848 epoch 1 - iter 13/138 - loss 3.43157067 - time (sec): 0.71 - samples/sec: 2929.63 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:25:46,565 epoch 1 - iter 26/138 - loss 2.94341200 - time (sec): 1.43 - samples/sec: 2874.16 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:25:47,299 epoch 1 - iter 39/138 - loss 2.44187050 - time (sec): 2.17 - samples/sec: 2856.13 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:25:48,033 epoch 1 - iter 52/138 - loss 1.99375963 - time (sec): 2.90 - samples/sec: 2884.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:25:48,817 epoch 1 - iter 65/138 - loss 1.67766635 - time (sec): 3.68 - samples/sec: 2904.62 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:25:49,568 epoch 1 - iter 78/138 - loss 1.47320680 - time (sec): 4.43 - samples/sec: 2945.81 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:25:50,281 epoch 1 - iter 91/138 - loss 1.34446452 - time (sec): 5.15 - samples/sec: 2939.89 - lr: 0.000033 - momentum: 0.000000
2023-10-17 08:25:51,011 epoch 1 - iter 104/138 - loss 1.21983278 - time (sec): 5.88 - samples/sec: 2920.15 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:25:51,739 epoch 1 - iter 117/138 - loss 1.12117619 - time (sec): 6.60 - samples/sec: 2922.93 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:25:52,492 epoch 1 - iter 130/138 - loss 1.04463454 - time (sec): 7.36 - samples/sec: 2913.32 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:25:52,958 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:52,959 EPOCH 1 done: loss 1.0032 - lr: 0.000047
2023-10-17 08:25:53,496 DEV : loss 0.24205686151981354 - f1-score (micro avg) 0.6482
2023-10-17 08:25:53,501 saving best model
2023-10-17 08:25:53,850 ----------------------------------------------------------------------------------------------------
2023-10-17 08:25:54,620 epoch 2 - iter 13/138 - loss 0.27043098 - time (sec): 0.77 - samples/sec: 3000.22 - lr: 0.000050 - momentum: 0.000000
2023-10-17 08:25:55,375 epoch 2 - iter 26/138 - loss 0.24273143 - time (sec): 1.52 - samples/sec: 2867.69 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:25:56,141 epoch 2 - iter 39/138 - loss 0.25085266 - time (sec): 2.29 - samples/sec: 2912.01 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:25:56,839 epoch 2 - iter 52/138 - loss 0.24140883 - time (sec): 2.99 - samples/sec: 2964.32 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:25:57,552 epoch 2 - iter 65/138 - loss 0.22835302 - time (sec): 3.70 - samples/sec: 2914.71 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:25:58,287 epoch 2 - iter 78/138 - loss 0.22322428 - time (sec): 4.43 - samples/sec: 2942.73 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:25:59,003 epoch 2 - iter 91/138 - loss 0.21610097 - time (sec): 5.15 - samples/sec: 2928.90 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:25:59,721 epoch 2 - iter 104/138 - loss 0.20830588 - time (sec): 5.87 - samples/sec: 2900.25 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:26:00,475 epoch 2 - iter 117/138 - loss 0.20383405 - time (sec): 6.62 - samples/sec: 2896.09 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:26:01,219 epoch 2 - iter 130/138 - loss 0.19579670 - time (sec): 7.37 - samples/sec: 2918.89 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:26:01,632 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:01,632 EPOCH 2 done: loss 0.1884 - lr: 0.000045
2023-10-17 08:26:02,266 DEV : loss 0.1432858556509018 - f1-score (micro avg) 0.81
2023-10-17 08:26:02,270 saving best model
2023-10-17 08:26:02,711 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:03,446 epoch 3 - iter 13/138 - loss 0.09662708 - time (sec): 0.73 - samples/sec: 2790.06 - lr: 0.000044 - momentum: 0.000000
2023-10-17 08:26:04,204 epoch 3 - iter 26/138 - loss 0.09741620 - time (sec): 1.49 - samples/sec: 2984.74 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:26:04,938 epoch 3 - iter 39/138 - loss 0.08527103 - time (sec): 2.22 - samples/sec: 2877.89 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:26:05,629 epoch 3 - iter 52/138 - loss 0.08554042 - time (sec): 2.91 - samples/sec: 2862.09 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:26:06,459 epoch 3 - iter 65/138 - loss 0.08704104 - time (sec): 3.74 - samples/sec: 2854.23 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:26:07,158 epoch 3 - iter 78/138 - loss 0.08765933 - time (sec): 4.44 - samples/sec: 2849.52 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:26:07,904 epoch 3 - iter 91/138 - loss 0.08846124 - time (sec): 5.19 - samples/sec: 2895.29 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:26:08,635 epoch 3 - iter 104/138 - loss 0.09460890 - time (sec): 5.92 - samples/sec: 2910.53 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:26:09,355 epoch 3 - iter 117/138 - loss 0.09452451 - time (sec): 6.64 - samples/sec: 2913.27 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:26:10,070 epoch 3 - iter 130/138 - loss 0.10364161 - time (sec): 7.35 - samples/sec: 2935.57 - lr: 0.000039 - momentum: 0.000000
2023-10-17 08:26:10,523 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:10,524 EPOCH 3 done: loss 0.1024 - lr: 0.000039
2023-10-17 08:26:11,150 DEV : loss 0.12657766044139862 - f1-score (micro avg) 0.8361
2023-10-17 08:26:11,155 saving best model
2023-10-17 08:26:11,582 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:12,355 epoch 4 - iter 13/138 - loss 0.06164321 - time (sec): 0.77 - samples/sec: 3073.48 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:26:13,056 epoch 4 - iter 26/138 - loss 0.06900307 - time (sec): 1.47 - samples/sec: 2954.64 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:26:13,791 epoch 4 - iter 39/138 - loss 0.06737054 - time (sec): 2.21 - samples/sec: 2937.82 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:26:14,545 epoch 4 - iter 52/138 - loss 0.07217805 - time (sec): 2.96 - samples/sec: 2878.16 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:26:15,299 epoch 4 - iter 65/138 - loss 0.08353830 - time (sec): 3.72 - samples/sec: 2937.82 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:26:16,032 epoch 4 - iter 78/138 - loss 0.08015663 - time (sec): 4.45 - samples/sec: 2898.27 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:26:16,795 epoch 4 - iter 91/138 - loss 0.08026446 - time (sec): 5.21 - samples/sec: 2882.55 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:26:17,553 epoch 4 - iter 104/138 - loss 0.07646328 - time (sec): 5.97 - samples/sec: 2892.23 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:26:18,316 epoch 4 - iter 117/138 - loss 0.07885204 - time (sec): 6.73 - samples/sec: 2889.55 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:26:19,036 epoch 4 - iter 130/138 - loss 0.07560251 - time (sec): 7.45 - samples/sec: 2876.21 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:26:19,522 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:19,522 EPOCH 4 done: loss 0.0749 - lr: 0.000034
2023-10-17 08:26:20,300 DEV : loss 0.1538553535938263 - f1-score (micro avg) 0.8603
2023-10-17 08:26:20,304 saving best model
2023-10-17 08:26:20,751 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:21,519 epoch 5 - iter 13/138 - loss 0.08129829 - time (sec): 0.76 - samples/sec: 2871.69 - lr: 0.000033 - momentum: 0.000000
2023-10-17 08:26:22,261 epoch 5 - iter 26/138 - loss 0.07047005 - time (sec): 1.51 - samples/sec: 2959.86 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:26:23,009 epoch 5 - iter 39/138 - loss 0.05637991 - time (sec): 2.25 - samples/sec: 2912.22 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:26:23,713 epoch 5 - iter 52/138 - loss 0.06461053 - time (sec): 2.96 - samples/sec: 2932.41 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:26:24,440 epoch 5 - iter 65/138 - loss 0.05885473 - time (sec): 3.68 - samples/sec: 2890.02 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:26:25,168 epoch 5 - iter 78/138 - loss 0.05439754 - time (sec): 4.41 - samples/sec: 2858.89 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:26:25,890 epoch 5 - iter 91/138 - loss 0.05002054 - time (sec): 5.13 - samples/sec: 2894.96 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:26:26,677 epoch 5 - iter 104/138 - loss 0.04951814 - time (sec): 5.92 - samples/sec: 2896.48 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:26:27,422 epoch 5 - iter 117/138 - loss 0.05822267 - time (sec): 6.67 - samples/sec: 2926.58 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:26:28,160 epoch 5 - iter 130/138 - loss 0.05859372 - time (sec): 7.40 - samples/sec: 2912.45 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:26:28,641 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:28,641 EPOCH 5 done: loss 0.0567 - lr: 0.000028
2023-10-17 08:26:29,293 DEV : loss 0.1780824512243271 - f1-score (micro avg) 0.8585
2023-10-17 08:26:29,298 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:30,026 epoch 6 - iter 13/138 - loss 0.01633802 - time (sec): 0.73 - samples/sec: 2852.60 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:26:30,810 epoch 6 - iter 26/138 - loss 0.04230440 - time (sec): 1.51 - samples/sec: 3087.62 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:26:31,517 epoch 6 - iter 39/138 - loss 0.03934797 - time (sec): 2.22 - samples/sec: 3064.13 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:26:32,220 epoch 6 - iter 52/138 - loss 0.03589065 - time (sec): 2.92 - samples/sec: 3006.27 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:26:32,987 epoch 6 - iter 65/138 - loss 0.03127002 - time (sec): 3.69 - samples/sec: 2954.78 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:26:33,761 epoch 6 - iter 78/138 - loss 0.03359051 - time (sec): 4.46 - samples/sec: 2937.72 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:26:34,495 epoch 6 - iter 91/138 - loss 0.03844091 - time (sec): 5.20 - samples/sec: 2929.76 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:26:35,251 epoch 6 - iter 104/138 - loss 0.04316186 - time (sec): 5.95 - samples/sec: 2900.75 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:26:35,972 epoch 6 - iter 117/138 - loss 0.04013628 - time (sec): 6.67 - samples/sec: 2892.73 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:26:36,732 epoch 6 - iter 130/138 - loss 0.03952624 - time (sec): 7.43 - samples/sec: 2887.53 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:26:37,182 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:37,182 EPOCH 6 done: loss 0.0465 - lr: 0.000023
2023-10-17 08:26:37,817 DEV : loss 0.15540580451488495 - f1-score (micro avg) 0.878
2023-10-17 08:26:37,821 saving best model
2023-10-17 08:26:38,247 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:38,961 epoch 7 - iter 13/138 - loss 0.01397906 - time (sec): 0.71 - samples/sec: 2701.78 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:26:39,684 epoch 7 - iter 26/138 - loss 0.02628629 - time (sec): 1.43 - samples/sec: 2811.79 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:26:40,426 epoch 7 - iter 39/138 - loss 0.04371837 - time (sec): 2.18 - samples/sec: 2824.49 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:26:41,229 epoch 7 - iter 52/138 - loss 0.03915947 - time (sec): 2.98 - samples/sec: 2767.46 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:26:41,999 epoch 7 - iter 65/138 - loss 0.04067551 - time (sec): 3.75 - samples/sec: 2832.11 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:26:42,768 epoch 7 - iter 78/138 - loss 0.03483673 - time (sec): 4.52 - samples/sec: 2803.84 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:26:43,487 epoch 7 - iter 91/138 - loss 0.03369841 - time (sec): 5.24 - samples/sec: 2815.66 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:26:44,217 epoch 7 - iter 104/138 - loss 0.03193150 - time (sec): 5.97 - samples/sec: 2838.02 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:26:45,044 epoch 7 - iter 117/138 - loss 0.03238749 - time (sec): 6.79 - samples/sec: 2837.96 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:26:45,778 epoch 7 - iter 130/138 - loss 0.03219471 - time (sec): 7.53 - samples/sec: 2849.44 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:26:46,304 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:46,305 EPOCH 7 done: loss 0.0327 - lr: 0.000017
2023-10-17 08:26:46,939 DEV : loss 0.1837640106678009 - f1-score (micro avg) 0.8723
2023-10-17 08:26:46,943 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:47,694 epoch 8 - iter 13/138 - loss 0.03074508 - time (sec): 0.75 - samples/sec: 2800.68 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:26:48,456 epoch 8 - iter 26/138 - loss 0.03711762 - time (sec): 1.51 - samples/sec: 2885.28 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:26:49,183 epoch 8 - iter 39/138 - loss 0.03737419 - time (sec): 2.24 - samples/sec: 2923.03 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:26:49,928 epoch 8 - iter 52/138 - loss 0.03129664 - time (sec): 2.98 - samples/sec: 2923.94 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:26:50,719 epoch 8 - iter 65/138 - loss 0.02973725 - time (sec): 3.78 - samples/sec: 2904.04 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:26:51,464 epoch 8 - iter 78/138 - loss 0.02665196 - time (sec): 4.52 - samples/sec: 2865.84 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:26:52,233 epoch 8 - iter 91/138 - loss 0.02400992 - time (sec): 5.29 - samples/sec: 2882.20 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:26:52,997 epoch 8 - iter 104/138 - loss 0.02313313 - time (sec): 6.05 - samples/sec: 2852.01 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:26:53,736 epoch 8 - iter 117/138 - loss 0.02525509 - time (sec): 6.79 - samples/sec: 2864.69 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:26:54,476 epoch 8 - iter 130/138 - loss 0.02647068 - time (sec): 7.53 - samples/sec: 2872.09 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:26:54,891 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:54,891 EPOCH 8 done: loss 0.0254 - lr: 0.000012
2023-10-17 08:26:55,522 DEV : loss 0.1877746433019638 - f1-score (micro avg) 0.8886
2023-10-17 08:26:55,527 saving best model
2023-10-17 08:26:55,982 ----------------------------------------------------------------------------------------------------
2023-10-17 08:26:56,735 epoch 9 - iter 13/138 - loss 0.03739812 - time (sec): 0.75 - samples/sec: 2778.28 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:26:57,511 epoch 9 - iter 26/138 - loss 0.03150704 - time (sec): 1.52 - samples/sec: 2646.10 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:26:58,254 epoch 9 - iter 39/138 - loss 0.02277717 - time (sec): 2.27 - samples/sec: 2725.40 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:26:59,043 epoch 9 - iter 52/138 - loss 0.02272279 - time (sec): 3.06 - samples/sec: 2739.97 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:26:59,809 epoch 9 - iter 65/138 - loss 0.02141050 - time (sec): 3.82 - samples/sec: 2755.69 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:27:00,593 epoch 9 - iter 78/138 - loss 0.01807415 - time (sec): 4.61 - samples/sec: 2814.59 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:27:01,320 epoch 9 - iter 91/138 - loss 0.01752412 - time (sec): 5.33 - samples/sec: 2842.18 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:27:02,012 epoch 9 - iter 104/138 - loss 0.01795085 - time (sec): 6.02 - samples/sec: 2815.64 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:27:02,736 epoch 9 - iter 117/138 - loss 0.01734781 - time (sec): 6.75 - samples/sec: 2805.11 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:27:03,488 epoch 9 - iter 130/138 - loss 0.01818842 - time (sec): 7.50 - samples/sec: 2821.79 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:27:03,951 ----------------------------------------------------------------------------------------------------
2023-10-17 08:27:03,951 EPOCH 9 done: loss 0.0195 - lr: 0.000006
2023-10-17 08:27:04,597 DEV : loss 0.1937594711780548 - f1-score (micro avg) 0.891
2023-10-17 08:27:04,602 saving best model
2023-10-17 08:27:05,034 ----------------------------------------------------------------------------------------------------
2023-10-17 08:27:05,765 epoch 10 - iter 13/138 - loss 0.01784379 - time (sec): 0.73 - samples/sec: 2849.89 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:27:06,512 epoch 10 - iter 26/138 - loss 0.00971714 - time (sec): 1.48 - samples/sec: 2742.12 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:27:07,254 epoch 10 - iter 39/138 - loss 0.01334067 - time (sec): 2.22 - samples/sec: 2773.75 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:27:08,064 epoch 10 - iter 52/138 - loss 0.01038007 - time (sec): 3.03 - samples/sec: 2718.39 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:27:08,829 epoch 10 - iter 65/138 - loss 0.00834679 - time (sec): 3.79 - samples/sec: 2764.06 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:27:09,585 epoch 10 - iter 78/138 - loss 0.01155802 - time (sec): 4.55 - samples/sec: 2772.08 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:27:10,365 epoch 10 - iter 91/138 - loss 0.01175069 - time (sec): 5.33 - samples/sec: 2802.48 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:27:11,119 epoch 10 - iter 104/138 - loss 0.01383309 - time (sec): 6.08 - samples/sec: 2813.68 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:27:11,847 epoch 10 - iter 117/138 - loss 0.01436897 - time (sec): 6.81 - samples/sec: 2843.50 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:27:12,582 epoch 10 - iter 130/138 - loss 0.01313075 - time (sec): 7.55 - samples/sec: 2871.93 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:27:13,015 ----------------------------------------------------------------------------------------------------
2023-10-17 08:27:13,016 EPOCH 10 done: loss 0.0135 - lr: 0.000000
2023-10-17 08:27:13,658 DEV : loss 0.19636619091033936 - f1-score (micro avg) 0.8948
2023-10-17 08:27:13,663 saving best model
2023-10-17 08:27:14,448 ----------------------------------------------------------------------------------------------------
2023-10-17 08:27:14,449 Loading model from best epoch ...
2023-10-17 08:27:15,783 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:27:16,610
Results:
- F-score (micro) 0.9146
- F-score (macro) 0.8801
- Accuracy 0.8509
By class:
              precision    recall  f1-score   support

       scope     0.8920    0.8920    0.8920       176
        pers     0.9690    0.9766    0.9728       128
        work     0.8873    0.8514    0.8690        74
      object     1.0000    1.0000    1.0000         2
         loc     1.0000    0.5000    0.6667         2

   micro avg     0.9182    0.9110    0.9146       382
   macro avg     0.9497    0.8440    0.8801       382
weighted avg     0.9180    0.9110    0.9140       382
2023-10-17 08:27:16,611 ----------------------------------------------------------------------------------------------------