2023-10-13 18:36:40,767 ----------------------------------------------------------------------------------------------------
2023-10-13 18:36:40,768 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 18:36:40,768 ----------------------------------------------------------------------------------------------------
2023-10-13 18:36:40,768 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 18:36:40,768 ----------------------------------------------------------------------------------------------------
2023-10-13 18:36:40,768 Train: 5901 sentences
2023-10-13 18:36:40,768 (train_with_dev=False, train_with_test=False)
2023-10-13 18:36:40,768 ----------------------------------------------------------------------------------------------------
2023-10-13 18:36:40,768 Training Params:
2023-10-13 18:36:40,768 - learning_rate: "5e-05"
2023-10-13 18:36:40,768 - mini_batch_size: "8"
2023-10-13 18:36:40,768 - max_epochs: "10"
2023-10-13 18:36:40,768 - shuffle: "True"
2023-10-13 18:36:40,768 ----------------------------------------------------------------------------------------------------
2023-10-13 18:36:40,768 Plugins:
2023-10-13 18:36:40,768 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 18:36:40,768 ----------------------------------------------------------------------------------------------------
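The learning-rate column in the per-iteration lines below can be reproduced from the parameters above. The following is a minimal sketch, not Flair's own scheduler code: it assumes a linear warmup over the first 10% of all optimizer steps followed by a linear decay to zero, and it assumes the logged value at iteration i is the rate in effect before that iteration's scheduler step (an inference from the logged numbers, not something the log states). The function names are hypothetical.

```python
import math

# Values taken from the log above.
TRAIN_SENTENCES = 5901
MINI_BATCH_SIZE = 8
MAX_EPOCHS = 10
PEAK_LR = 5e-05
WARMUP_FRACTION = 0.1

# 5901 / 8 = 737.6, so the last batch of each epoch is a partial one.
STEPS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
TOTAL_STEPS = STEPS_PER_EPOCH * MAX_EPOCHS
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_FRACTION)


def linear_schedule_lr(step: int) -> float:
    """Learning rate after `step` scheduler updates (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Linear decay from the peak rate down to 0 at the final step.
    remaining = TOTAL_STEPS - step
    return PEAK_LR * max(0.0, remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))


def logged_lr(epoch: int, iteration: int) -> str:
    """Format the rate as the log does, assuming a one-step lag (hypothetical helper)."""
    global_step = (epoch - 1) * STEPS_PER_EPOCH + iteration
    return f"{linear_schedule_lr(global_step - 1):.6f}"


if __name__ == "__main__":
    for epoch, iteration in [(1, 73), (1, 584), (2, 219), (10, 730)]:
        print(f"epoch {epoch} - iter {iteration}/{STEPS_PER_EPOCH} - lr: {logged_lr(epoch, iteration)}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1 (738 of 7380 steps), which is consistent with the rate peaking near 0.000049 there and decaying afterwards.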
2023-10-13 18:36:40,768 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 18:36:40,768 - metric: "('micro avg', 'f1-score')"
2023-10-13 18:36:40,768 ----------------------------------------------------------------------------------------------------
2023-10-13 18:36:40,768 Computation:
2023-10-13 18:36:40,768 - compute on device: cuda:0
2023-10-13 18:36:40,768 - embedding storage: none
2023-10-13 18:36:40,768 ----------------------------------------------------------------------------------------------------
2023-10-13 18:36:40,768 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-13 18:36:40,769 ----------------------------------------------------------------------------------------------------
2023-10-13 18:36:40,769 ----------------------------------------------------------------------------------------------------
2023-10-13 18:36:45,338 epoch 1 - iter 73/738 - loss 2.54931084 - time (sec): 4.57 - samples/sec: 3537.61 - lr: 0.000005 - momentum: 0.000000
2023-10-13 18:36:50,041 epoch 1 - iter 146/738 - loss 1.59605592 - time (sec): 9.27 - samples/sec: 3504.76 - lr: 0.000010 - momentum: 0.000000
2023-10-13 18:36:54,788 epoch 1 - iter 219/738 - loss 1.19783690 - time (sec): 14.02 - samples/sec: 3467.00 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:36:59,439 epoch 1 - iter 292/738 - loss 0.98169682 - time (sec): 18.67 - samples/sec: 3456.39 - lr: 0.000020 - momentum: 0.000000
2023-10-13 18:37:04,570 epoch 1 - iter 365/738 - loss 0.84572983 - time (sec): 23.80 - samples/sec: 3410.98 - lr: 0.000025 - momentum: 0.000000
2023-10-13 18:37:09,103 epoch 1 - iter 438/738 - loss 0.75132701 - time (sec): 28.33 - samples/sec: 3402.43 - lr: 0.000030 - momentum: 0.000000
2023-10-13 18:37:14,016 epoch 1 - iter 511/738 - loss 0.67059184 - time (sec): 33.25 - samples/sec: 3422.44 - lr: 0.000035 - momentum: 0.000000
2023-10-13 18:37:19,055 epoch 1 - iter 584/738 - loss 0.60768668 - time (sec): 38.29 - samples/sec: 3429.10 - lr: 0.000039 - momentum: 0.000000
2023-10-13 18:37:23,839 epoch 1 - iter 657/738 - loss 0.56205620 - time (sec): 43.07 - samples/sec: 3426.96 - lr: 0.000044 - momentum: 0.000000
2023-10-13 18:37:29,004 epoch 1 - iter 730/738 - loss 0.52323394 - time (sec): 48.23 - samples/sec: 3413.91 - lr: 0.000049 - momentum: 0.000000
2023-10-13 18:37:29,580 ----------------------------------------------------------------------------------------------------
2023-10-13 18:37:29,580 EPOCH 1 done: loss 0.5197 - lr: 0.000049
2023-10-13 18:37:35,971 DEV : loss 0.12315742671489716 - f1-score (micro avg) 0.725
2023-10-13 18:37:36,008 saving best model
2023-10-13 18:37:36,393 ----------------------------------------------------------------------------------------------------
2023-10-13 18:37:42,332 epoch 2 - iter 73/738 - loss 0.12545050 - time (sec): 5.94 - samples/sec: 2810.64 - lr: 0.000049 - momentum: 0.000000
2023-10-13 18:37:47,119 epoch 2 - iter 146/738 - loss 0.13016972 - time (sec): 10.72 - samples/sec: 3078.41 - lr: 0.000049 - momentum: 0.000000
2023-10-13 18:37:52,148 epoch 2 - iter 219/738 - loss 0.12926236 - time (sec): 15.75 - samples/sec: 3155.66 - lr: 0.000048 - momentum: 0.000000
2023-10-13 18:37:56,732 epoch 2 - iter 292/738 - loss 0.12658168 - time (sec): 20.34 - samples/sec: 3203.09 - lr: 0.000048 - momentum: 0.000000
2023-10-13 18:38:01,818 epoch 2 - iter 365/738 - loss 0.12194424 - time (sec): 25.42 - samples/sec: 3270.82 - lr: 0.000047 - momentum: 0.000000
2023-10-13 18:38:08,055 epoch 2 - iter 438/738 - loss 0.12424182 - time (sec): 31.66 - samples/sec: 3280.65 - lr: 0.000047 - momentum: 0.000000
2023-10-13 18:38:12,544 epoch 2 - iter 511/738 - loss 0.12292217 - time (sec): 36.15 - samples/sec: 3293.17 - lr: 0.000046 - momentum: 0.000000
2023-10-13 18:38:17,596 epoch 2 - iter 584/738 - loss 0.12227691 - time (sec): 41.20 - samples/sec: 3299.69 - lr: 0.000046 - momentum: 0.000000
2023-10-13 18:38:21,808 epoch 2 - iter 657/738 - loss 0.12156959 - time (sec): 45.41 - samples/sec: 3310.25 - lr: 0.000045 - momentum: 0.000000
2023-10-13 18:38:26,383 epoch 2 - iter 730/738 - loss 0.12154467 - time (sec): 49.99 - samples/sec: 3300.38 - lr: 0.000045 - momentum: 0.000000
2023-10-13 18:38:26,800 ----------------------------------------------------------------------------------------------------
2023-10-13 18:38:26,800 EPOCH 2 done: loss 0.1208 - lr: 0.000045
2023-10-13 18:38:38,300 DEV : loss 0.12709642946720123 - f1-score (micro avg) 0.7786
2023-10-13 18:38:38,335 saving best model
2023-10-13 18:38:38,922 ----------------------------------------------------------------------------------------------------
2023-10-13 18:38:44,137 epoch 3 - iter 73/738 - loss 0.07100891 - time (sec): 5.21 - samples/sec: 3549.04 - lr: 0.000044 - momentum: 0.000000
2023-10-13 18:38:49,196 epoch 3 - iter 146/738 - loss 0.06897740 - time (sec): 10.27 - samples/sec: 3413.40 - lr: 0.000043 - momentum: 0.000000
2023-10-13 18:38:53,971 epoch 3 - iter 219/738 - loss 0.07112884 - time (sec): 15.04 - samples/sec: 3440.71 - lr: 0.000043 - momentum: 0.000000
2023-10-13 18:38:59,474 epoch 3 - iter 292/738 - loss 0.07710969 - time (sec): 20.55 - samples/sec: 3404.61 - lr: 0.000042 - momentum: 0.000000
2023-10-13 18:39:04,037 epoch 3 - iter 365/738 - loss 0.07954551 - time (sec): 25.11 - samples/sec: 3390.98 - lr: 0.000042 - momentum: 0.000000
2023-10-13 18:39:08,941 epoch 3 - iter 438/738 - loss 0.07609338 - time (sec): 30.01 - samples/sec: 3374.51 - lr: 0.000041 - momentum: 0.000000
2023-10-13 18:39:13,532 epoch 3 - iter 511/738 - loss 0.07547912 - time (sec): 34.60 - samples/sec: 3375.82 - lr: 0.000041 - momentum: 0.000000
2023-10-13 18:39:18,257 epoch 3 - iter 584/738 - loss 0.07423094 - time (sec): 39.33 - samples/sec: 3375.57 - lr: 0.000040 - momentum: 0.000000
2023-10-13 18:39:23,065 epoch 3 - iter 657/738 - loss 0.07474968 - time (sec): 44.14 - samples/sec: 3370.05 - lr: 0.000040 - momentum: 0.000000
2023-10-13 18:39:27,700 epoch 3 - iter 730/738 - loss 0.07410272 - time (sec): 48.77 - samples/sec: 3381.11 - lr: 0.000039 - momentum: 0.000000
2023-10-13 18:39:28,155 ----------------------------------------------------------------------------------------------------
2023-10-13 18:39:28,156 EPOCH 3 done: loss 0.0743 - lr: 0.000039
2023-10-13 18:39:39,626 DEV : loss 0.12267415970563889 - f1-score (micro avg) 0.8113
2023-10-13 18:39:39,657 saving best model
2023-10-13 18:39:40,149 ----------------------------------------------------------------------------------------------------
2023-10-13 18:39:45,061 epoch 4 - iter 73/738 - loss 0.04639953 - time (sec): 4.91 - samples/sec: 3252.59 - lr: 0.000038 - momentum: 0.000000
2023-10-13 18:39:50,759 epoch 4 - iter 146/738 - loss 0.05317424 - time (sec): 10.61 - samples/sec: 3345.90 - lr: 0.000038 - momentum: 0.000000
2023-10-13 18:39:55,789 epoch 4 - iter 219/738 - loss 0.05357739 - time (sec): 15.64 - samples/sec: 3291.31 - lr: 0.000037 - momentum: 0.000000
2023-10-13 18:40:00,304 epoch 4 - iter 292/738 - loss 0.05278044 - time (sec): 20.15 - samples/sec: 3277.58 - lr: 0.000037 - momentum: 0.000000
2023-10-13 18:40:05,274 epoch 4 - iter 365/738 - loss 0.05181301 - time (sec): 25.12 - samples/sec: 3295.30 - lr: 0.000036 - momentum: 0.000000
2023-10-13 18:40:09,947 epoch 4 - iter 438/738 - loss 0.05305014 - time (sec): 29.79 - samples/sec: 3318.60 - lr: 0.000036 - momentum: 0.000000
2023-10-13 18:40:14,570 epoch 4 - iter 511/738 - loss 0.05296526 - time (sec): 34.42 - samples/sec: 3310.78 - lr: 0.000035 - momentum: 0.000000
2023-10-13 18:40:19,155 epoch 4 - iter 584/738 - loss 0.05307194 - time (sec): 39.00 - samples/sec: 3325.60 - lr: 0.000035 - momentum: 0.000000
2023-10-13 18:40:24,038 epoch 4 - iter 657/738 - loss 0.05280718 - time (sec): 43.89 - samples/sec: 3317.54 - lr: 0.000034 - momentum: 0.000000
2023-10-13 18:40:29,545 epoch 4 - iter 730/738 - loss 0.05136550 - time (sec): 49.39 - samples/sec: 3335.13 - lr: 0.000033 - momentum: 0.000000
2023-10-13 18:40:30,006 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:30,006 EPOCH 4 done: loss 0.0520 - lr: 0.000033
2023-10-13 18:40:41,252 DEV : loss 0.16094207763671875 - f1-score (micro avg) 0.8146
2023-10-13 18:40:41,284 saving best model
2023-10-13 18:40:41,778 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:46,604 epoch 5 - iter 73/738 - loss 0.04101661 - time (sec): 4.82 - samples/sec: 3467.75 - lr: 0.000033 - momentum: 0.000000
2023-10-13 18:40:51,195 epoch 5 - iter 146/738 - loss 0.03612267 - time (sec): 9.42 - samples/sec: 3326.71 - lr: 0.000032 - momentum: 0.000000
2023-10-13 18:40:56,254 epoch 5 - iter 219/738 - loss 0.03677011 - time (sec): 14.47 - samples/sec: 3313.52 - lr: 0.000032 - momentum: 0.000000
2023-10-13 18:41:01,176 epoch 5 - iter 292/738 - loss 0.03501988 - time (sec): 19.40 - samples/sec: 3322.02 - lr: 0.000031 - momentum: 0.000000
2023-10-13 18:41:06,266 epoch 5 - iter 365/738 - loss 0.03519205 - time (sec): 24.49 - samples/sec: 3328.85 - lr: 0.000031 - momentum: 0.000000
2023-10-13 18:41:11,315 epoch 5 - iter 438/738 - loss 0.03597672 - time (sec): 29.54 - samples/sec: 3327.20 - lr: 0.000030 - momentum: 0.000000
2023-10-13 18:41:16,308 epoch 5 - iter 511/738 - loss 0.03576352 - time (sec): 34.53 - samples/sec: 3326.40 - lr: 0.000030 - momentum: 0.000000
2023-10-13 18:41:21,531 epoch 5 - iter 584/738 - loss 0.03576562 - time (sec): 39.75 - samples/sec: 3317.15 - lr: 0.000029 - momentum: 0.000000
2023-10-13 18:41:26,525 epoch 5 - iter 657/738 - loss 0.03587166 - time (sec): 44.75 - samples/sec: 3323.91 - lr: 0.000028 - momentum: 0.000000
2023-10-13 18:41:31,082 epoch 5 - iter 730/738 - loss 0.03565094 - time (sec): 49.30 - samples/sec: 3338.67 - lr: 0.000028 - momentum: 0.000000
2023-10-13 18:41:31,690 ----------------------------------------------------------------------------------------------------
2023-10-13 18:41:31,690 EPOCH 5 done: loss 0.0353 - lr: 0.000028
2023-10-13 18:41:42,876 DEV : loss 0.1758503019809723 - f1-score (micro avg) 0.8282
2023-10-13 18:41:42,905 saving best model
2023-10-13 18:41:43,492 ----------------------------------------------------------------------------------------------------
2023-10-13 18:41:48,507 epoch 6 - iter 73/738 - loss 0.02903248 - time (sec): 5.01 - samples/sec: 3384.93 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:41:53,294 epoch 6 - iter 146/738 - loss 0.02622059 - time (sec): 9.80 - samples/sec: 3291.20 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:41:58,719 epoch 6 - iter 219/738 - loss 0.02646847 - time (sec): 15.22 - samples/sec: 3178.14 - lr: 0.000026 - momentum: 0.000000
2023-10-13 18:42:03,631 epoch 6 - iter 292/738 - loss 0.02958010 - time (sec): 20.14 - samples/sec: 3207.56 - lr: 0.000026 - momentum: 0.000000
2023-10-13 18:42:08,390 epoch 6 - iter 365/738 - loss 0.02718745 - time (sec): 24.89 - samples/sec: 3219.83 - lr: 0.000025 - momentum: 0.000000
2023-10-13 18:42:12,874 epoch 6 - iter 438/738 - loss 0.02700128 - time (sec): 29.38 - samples/sec: 3244.77 - lr: 0.000025 - momentum: 0.000000
2023-10-13 18:42:18,097 epoch 6 - iter 511/738 - loss 0.02627538 - time (sec): 34.60 - samples/sec: 3267.81 - lr: 0.000024 - momentum: 0.000000
2023-10-13 18:42:22,935 epoch 6 - iter 584/738 - loss 0.02629923 - time (sec): 39.44 - samples/sec: 3279.67 - lr: 0.000023 - momentum: 0.000000
2023-10-13 18:42:27,917 epoch 6 - iter 657/738 - loss 0.02668968 - time (sec): 44.42 - samples/sec: 3282.99 - lr: 0.000023 - momentum: 0.000000
2023-10-13 18:42:32,910 epoch 6 - iter 730/738 - loss 0.02662988 - time (sec): 49.42 - samples/sec: 3320.82 - lr: 0.000022 - momentum: 0.000000
2023-10-13 18:42:33,632 ----------------------------------------------------------------------------------------------------
2023-10-13 18:42:33,633 EPOCH 6 done: loss 0.0264 - lr: 0.000022
2023-10-13 18:42:45,145 DEV : loss 0.18904562294483185 - f1-score (micro avg) 0.8241
2023-10-13 18:42:45,175 ----------------------------------------------------------------------------------------------------
2023-10-13 18:42:50,275 epoch 7 - iter 73/738 - loss 0.01959069 - time (sec): 5.10 - samples/sec: 3400.94 - lr: 0.000022 - momentum: 0.000000
2023-10-13 18:42:56,300 epoch 7 - iter 146/738 - loss 0.02030636 - time (sec): 11.12 - samples/sec: 3214.64 - lr: 0.000021 - momentum: 0.000000
2023-10-13 18:43:01,614 epoch 7 - iter 219/738 - loss 0.01749591 - time (sec): 16.44 - samples/sec: 3222.85 - lr: 0.000021 - momentum: 0.000000
2023-10-13 18:43:06,937 epoch 7 - iter 292/738 - loss 0.01940326 - time (sec): 21.76 - samples/sec: 3263.51 - lr: 0.000020 - momentum: 0.000000
2023-10-13 18:43:11,149 epoch 7 - iter 365/738 - loss 0.01847137 - time (sec): 25.97 - samples/sec: 3307.19 - lr: 0.000020 - momentum: 0.000000
2023-10-13 18:43:16,079 epoch 7 - iter 438/738 - loss 0.01934351 - time (sec): 30.90 - samples/sec: 3314.03 - lr: 0.000019 - momentum: 0.000000
2023-10-13 18:43:20,783 epoch 7 - iter 511/738 - loss 0.01938525 - time (sec): 35.61 - samples/sec: 3317.37 - lr: 0.000018 - momentum: 0.000000
2023-10-13 18:43:25,364 epoch 7 - iter 584/738 - loss 0.01849506 - time (sec): 40.19 - samples/sec: 3323.78 - lr: 0.000018 - momentum: 0.000000
2023-10-13 18:43:30,037 epoch 7 - iter 657/738 - loss 0.01814382 - time (sec): 44.86 - samples/sec: 3329.67 - lr: 0.000017 - momentum: 0.000000
2023-10-13 18:43:34,614 epoch 7 - iter 730/738 - loss 0.01783909 - time (sec): 49.44 - samples/sec: 3329.33 - lr: 0.000017 - momentum: 0.000000
2023-10-13 18:43:35,103 ----------------------------------------------------------------------------------------------------
2023-10-13 18:43:35,103 EPOCH 7 done: loss 0.0178 - lr: 0.000017
2023-10-13 18:43:46,346 DEV : loss 0.2096458524465561 - f1-score (micro avg) 0.8188
2023-10-13 18:43:46,376 ----------------------------------------------------------------------------------------------------
2023-10-13 18:43:51,287 epoch 8 - iter 73/738 - loss 0.01043596 - time (sec): 4.91 - samples/sec: 3589.86 - lr: 0.000016 - momentum: 0.000000
2023-10-13 18:43:55,837 epoch 8 - iter 146/738 - loss 0.00825281 - time (sec): 9.46 - samples/sec: 3471.96 - lr: 0.000016 - momentum: 0.000000
2023-10-13 18:44:01,171 epoch 8 - iter 219/738 - loss 0.01181636 - time (sec): 14.79 - samples/sec: 3487.90 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:44:06,152 epoch 8 - iter 292/738 - loss 0.01070298 - time (sec): 19.77 - samples/sec: 3387.84 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:44:10,412 epoch 8 - iter 365/738 - loss 0.01219064 - time (sec): 24.03 - samples/sec: 3387.86 - lr: 0.000014 - momentum: 0.000000
2023-10-13 18:44:15,244 epoch 8 - iter 438/738 - loss 0.01311692 - time (sec): 28.87 - samples/sec: 3369.74 - lr: 0.000013 - momentum: 0.000000
2023-10-13 18:44:20,242 epoch 8 - iter 511/738 - loss 0.01306631 - time (sec): 33.87 - samples/sec: 3387.73 - lr: 0.000013 - momentum: 0.000000
2023-10-13 18:44:25,159 epoch 8 - iter 584/738 - loss 0.01229285 - time (sec): 38.78 - samples/sec: 3375.92 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:44:30,088 epoch 8 - iter 657/738 - loss 0.01286131 - time (sec): 43.71 - samples/sec: 3366.15 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:44:35,102 epoch 8 - iter 730/738 - loss 0.01254249 - time (sec): 48.72 - samples/sec: 3367.31 - lr: 0.000011 - momentum: 0.000000
2023-10-13 18:44:35,798 ----------------------------------------------------------------------------------------------------
2023-10-13 18:44:35,798 EPOCH 8 done: loss 0.0125 - lr: 0.000011
2023-10-13 18:44:47,081 DEV : loss 0.2140151411294937 - f1-score (micro avg) 0.8264
2023-10-13 18:44:47,113 ----------------------------------------------------------------------------------------------------
2023-10-13 18:44:51,980 epoch 9 - iter 73/738 - loss 0.00976824 - time (sec): 4.87 - samples/sec: 3281.97 - lr: 0.000011 - momentum: 0.000000
2023-10-13 18:44:56,729 epoch 9 - iter 146/738 - loss 0.01142894 - time (sec): 9.61 - samples/sec: 3343.88 - lr: 0.000010 - momentum: 0.000000
2023-10-13 18:45:01,104 epoch 9 - iter 219/738 - loss 0.00946193 - time (sec): 13.99 - samples/sec: 3350.40 - lr: 0.000010 - momentum: 0.000000
2023-10-13 18:45:06,392 epoch 9 - iter 292/738 - loss 0.01046465 - time (sec): 19.28 - samples/sec: 3308.53 - lr: 0.000009 - momentum: 0.000000
2023-10-13 18:45:11,073 epoch 9 - iter 365/738 - loss 0.00966350 - time (sec): 23.96 - samples/sec: 3309.18 - lr: 0.000008 - momentum: 0.000000
2023-10-13 18:45:16,024 epoch 9 - iter 438/738 - loss 0.00924954 - time (sec): 28.91 - samples/sec: 3301.58 - lr: 0.000008 - momentum: 0.000000
2023-10-13 18:45:21,095 epoch 9 - iter 511/738 - loss 0.00925664 - time (sec): 33.98 - samples/sec: 3327.92 - lr: 0.000007 - momentum: 0.000000
2023-10-13 18:45:26,558 epoch 9 - iter 584/738 - loss 0.00935297 - time (sec): 39.44 - samples/sec: 3333.00 - lr: 0.000007 - momentum: 0.000000
2023-10-13 18:45:31,135 epoch 9 - iter 657/738 - loss 0.00875783 - time (sec): 44.02 - samples/sec: 3334.98 - lr: 0.000006 - momentum: 0.000000
2023-10-13 18:45:35,898 epoch 9 - iter 730/738 - loss 0.00930370 - time (sec): 48.78 - samples/sec: 3354.77 - lr: 0.000006 - momentum: 0.000000
2023-10-13 18:45:36,765 ----------------------------------------------------------------------------------------------------
2023-10-13 18:45:36,765 EPOCH 9 done: loss 0.0092 - lr: 0.000006
2023-10-13 18:45:47,968 DEV : loss 0.216547891497612 - f1-score (micro avg) 0.8287
2023-10-13 18:45:47,999 saving best model
2023-10-13 18:45:48,546 ----------------------------------------------------------------------------------------------------
2023-10-13 18:45:53,055 epoch 10 - iter 73/738 - loss 0.00309356 - time (sec): 4.50 - samples/sec: 3370.27 - lr: 0.000005 - momentum: 0.000000
2023-10-13 18:45:58,249 epoch 10 - iter 146/738 - loss 0.00338579 - time (sec): 9.70 - samples/sec: 3347.86 - lr: 0.000004 - momentum: 0.000000
2023-10-13 18:46:03,400 epoch 10 - iter 219/738 - loss 0.00387549 - time (sec): 14.85 - samples/sec: 3316.93 - lr: 0.000004 - momentum: 0.000000
2023-10-13 18:46:08,537 epoch 10 - iter 292/738 - loss 0.00458478 - time (sec): 19.99 - samples/sec: 3320.98 - lr: 0.000003 - momentum: 0.000000
2023-10-13 18:46:13,808 epoch 10 - iter 365/738 - loss 0.00487712 - time (sec): 25.26 - samples/sec: 3335.23 - lr: 0.000003 - momentum: 0.000000
2023-10-13 18:46:18,296 epoch 10 - iter 438/738 - loss 0.00479942 - time (sec): 29.75 - samples/sec: 3354.93 - lr: 0.000002 - momentum: 0.000000
2023-10-13 18:46:22,698 epoch 10 - iter 511/738 - loss 0.00567514 - time (sec): 34.15 - samples/sec: 3371.44 - lr: 0.000002 - momentum: 0.000000
2023-10-13 18:46:27,738 epoch 10 - iter 584/738 - loss 0.00588151 - time (sec): 39.19 - samples/sec: 3356.15 - lr: 0.000001 - momentum: 0.000000
2023-10-13 18:46:32,799 epoch 10 - iter 657/738 - loss 0.00591418 - time (sec): 44.25 - samples/sec: 3374.15 - lr: 0.000001 - momentum: 0.000000
2023-10-13 18:46:37,320 epoch 10 - iter 730/738 - loss 0.00582670 - time (sec): 48.77 - samples/sec: 3379.83 - lr: 0.000000 - momentum: 0.000000
2023-10-13 18:46:37,769 ----------------------------------------------------------------------------------------------------
2023-10-13 18:46:37,769 EPOCH 10 done: loss 0.0058 - lr: 0.000000
2023-10-13 18:46:49,686 DEV : loss 0.22645464539527893 - f1-score (micro avg) 0.8308
2023-10-13 18:46:49,715 saving best model
2023-10-13 18:46:50,693 ----------------------------------------------------------------------------------------------------
2023-10-13 18:46:50,695 Loading model from best epoch ...
2023-10-13 18:46:52,055 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 18:46:58,097
Results:
- F-score (micro) 0.7976
- F-score (macro) 0.7
- Accuracy 0.6875
By class:
              precision    recall  f1-score   support

         loc     0.8567    0.8776    0.8670       858
        pers     0.7557    0.8063    0.7802       537
         org     0.5423    0.5833    0.5620       132
        prod     0.6923    0.7377    0.7143        61
        time     0.5312    0.6296    0.5763        54

   micro avg     0.7789    0.8173    0.7976      1642
   macro avg     0.6756    0.7269    0.7000      1642
weighted avg     0.7815    0.8173    0.7989      1642
2023-10-13 18:46:58,097 ----------------------------------------------------------------------------------------------------
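The summary rows of the results table follow from the per-class rows by simple arithmetic. The sketch below recomputes them from the rounded values printed in the log; it is an illustration of the averaging definitions, not the evaluation code that produced the log, and the variable names are hypothetical. Because the inputs are already rounded to four decimals, the comparison is only meaningful at that precision.

```python
# Per-class F1 and support, copied from the table in the log.
per_class = {
    "loc": (0.8670, 858),
    "pers": (0.7802, 537),
    "org": (0.5620, 132),
    "prod": (0.6923 * 0 + 0.7143, 61),  # F1 for prod is 0.7143
    "time": (0.5763, 54),
}

# Micro-averaged precision and recall, copied from the "micro avg" row.
micro_precision = 0.7789
micro_recall = 0.8173


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(support for _, support in per_class.values())

# Micro F1: harmonic mean of the pooled precision and recall.
micro_f1 = f1(micro_precision, micro_recall)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(score for score, _ in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 scores weighted by class support.
weighted_f1 = sum(score * support for score, support in per_class.values()) / total_support

if __name__ == "__main__":
    print(f"support total: {total_support}")
    print(f"micro F1:      {micro_f1:.4f}")
    print(f"macro F1:      {macro_f1:.4f}")
    print(f"weighted F1:   {weighted_f1:.4f}")
```

The macro value comes out as 0.7000 at four decimals, which is the same number the "Results" header prints as 0.7 without trailing zeros. The gap between micro (0.7976) and macro (0.7000) reflects the weaker scores on the small org and time classes.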