2023-10-17 08:51:37,528 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:37,529 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:51:37,529 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:37,529 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:51:37,529 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:37,529 Train: 1100 sentences
2023-10-17 08:51:37,529 (train_with_dev=False, train_with_test=False)
2023-10-17 08:51:37,529 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:37,529 Training Params:
2023-10-17 08:51:37,529  - learning_rate: "3e-05"
2023-10-17 08:51:37,529  - mini_batch_size: "8"
2023-10-17 08:51:37,529  - max_epochs: "10"
2023-10-17 08:51:37,530  - shuffle: "True"
2023-10-17 08:51:37,530 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:37,530 Plugins:
2023-10-17 08:51:37,530  - TensorboardLogger
2023-10-17 08:51:37,530  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:51:37,530 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:37,530 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:51:37,530  - metric: "('micro avg', 'f1-score')"
2023-10-17 08:51:37,530 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:37,530 Computation:
2023-10-17 08:51:37,530  - compute on device: cuda:0
2023-10-17 08:51:37,530  - embedding storage: none
2023-10-17 08:51:37,530 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:37,530 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 08:51:37,530 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:37,530 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:37,530 Logging anything other than scalars to TensorBoard is
currently not supported.
2023-10-17 08:51:38,256 epoch 1 - iter 13/138 - loss 4.13929230 - time (sec): 0.73 - samples/sec: 3067.00 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:51:38,992 epoch 1 - iter 26/138 - loss 3.79718513 - time (sec): 1.46 - samples/sec: 2780.95 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:51:39,747 epoch 1 - iter 39/138 - loss 3.22703788 - time (sec): 2.22 - samples/sec: 2845.77 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:51:40,483 epoch 1 - iter 52/138 - loss 2.69778465 - time (sec): 2.95 - samples/sec: 2856.07 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:51:41,241 epoch 1 - iter 65/138 - loss 2.30426441 - time (sec): 3.71 - samples/sec: 2840.43 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:51:41,979 epoch 1 - iter 78/138 - loss 2.02821589 - time (sec): 4.45 - samples/sec: 2837.96 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:51:42,732 epoch 1 - iter 91/138 - loss 1.79103008 - time (sec): 5.20 - samples/sec: 2891.79 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:51:43,477 epoch 1 - iter 104/138 - loss 1.61118784 - time (sec): 5.95 - samples/sec: 2952.08 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:51:44,177 epoch 1 - iter 117/138 - loss 1.48745793 - time (sec): 6.65 - samples/sec: 2942.78 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:51:44,938 epoch 1 - iter 130/138 - loss 1.38571854 - time (sec): 7.41 - samples/sec: 2913.80 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:51:45,374 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:45,374 EPOCH 1 done: loss 1.3302 - lr: 0.000028
2023-10-17 08:51:45,899 DEV : loss 0.2324945032596588 - f1-score (micro avg) 0.7011
2023-10-17 08:51:45,904 saving best model
2023-10-17 08:51:46,233 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:46,955 epoch 2 - iter 13/138 - loss 0.21424389 - time (sec): 0.72 - samples/sec: 3024.85 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:51:47,650 epoch 2 - iter 26/138 - loss 0.22756117 - time (sec): 1.42 - samples/sec: 3027.07 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:51:48,390 epoch 2 - iter 39/138 - loss 0.23316669 - time (sec): 2.16 - samples/sec: 2965.40 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:51:49,141 epoch 2 - iter 52/138 - loss 0.23340331 - time (sec): 2.91 - samples/sec: 3000.54 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:51:49,861 epoch 2 - iter 65/138 - loss 0.21923938 - time (sec): 3.63 - samples/sec: 2966.67 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:51:50,624 epoch 2 - iter 78/138 - loss 0.22163464 - time (sec): 4.39 - samples/sec: 2982.13 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:51:51,392 epoch 2 - iter 91/138 - loss 0.21586224 - time (sec): 5.16 - samples/sec: 2957.68 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:51:52,184 epoch 2 - iter 104/138 - loss 0.21695390 - time (sec): 5.95 - samples/sec: 2968.90 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:51:52,911 epoch 2 - iter 117/138 - loss 0.20838380 - time (sec): 6.68 - samples/sec: 2954.38 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:51:53,647 epoch 2 - iter 130/138 - loss 0.19758569 - time (sec): 7.41 - samples/sec: 2923.15 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:51:54,093 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:54,094 EPOCH 2 done: loss 0.1986 - lr: 0.000027
2023-10-17 08:51:54,727 DEV : loss 0.15399914979934692 - f1-score (micro avg) 0.8096
2023-10-17 08:51:54,733 saving best model
2023-10-17 08:51:55,180 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:55,898 epoch 3 - iter 13/138 - loss 0.13859865 - time (sec): 0.72 - samples/sec: 2819.16 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:51:56,588 epoch 3 - iter 26/138 - loss 0.11935752 - time (sec): 1.41 - samples/sec:
2702.56 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:51:57,289 epoch 3 - iter 39/138 - loss 0.11105883 - time (sec): 2.11 - samples/sec: 2811.17 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:51:58,089 epoch 3 - iter 52/138 - loss 0.11097839 - time (sec): 2.91 - samples/sec: 2888.44 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:51:58,787 epoch 3 - iter 65/138 - loss 0.11050263 - time (sec): 3.61 - samples/sec: 2863.20 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:51:59,559 epoch 3 - iter 78/138 - loss 0.11063464 - time (sec): 4.38 - samples/sec: 2876.69 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:52:00,281 epoch 3 - iter 91/138 - loss 0.10542292 - time (sec): 5.10 - samples/sec: 2881.93 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:52:01,011 epoch 3 - iter 104/138 - loss 0.10730679 - time (sec): 5.83 - samples/sec: 2901.46 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:52:01,772 epoch 3 - iter 117/138 - loss 0.10616174 - time (sec): 6.59 - samples/sec: 2907.98 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:52:02,515 epoch 3 - iter 130/138 - loss 0.11189608 - time (sec): 7.33 - samples/sec: 2927.18 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:52:02,989 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:02,989 EPOCH 3 done: loss 0.1121 - lr: 0.000024
2023-10-17 08:52:03,634 DEV : loss 0.1457735151052475 - f1-score (micro avg) 0.8237
2023-10-17 08:52:03,639 saving best model
2023-10-17 08:52:04,225 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:04,943 epoch 4 - iter 13/138 - loss 0.07250891 - time (sec): 0.71 - samples/sec: 2875.41 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:52:05,662 epoch 4 - iter 26/138 - loss 0.05856081 - time (sec): 1.43 - samples/sec: 2885.65 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:52:06,373 epoch 4 - iter 39/138 - loss 0.08637693 - time (sec): 2.14 - samples/sec: 2924.58 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:52:07,105 epoch 4 - iter 52/138 - loss 0.09432321 - time (sec): 2.87 - samples/sec: 2937.09 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:52:07,852 epoch 4 - iter 65/138 - loss 0.08795622 - time (sec): 3.62 - samples/sec: 2896.40 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:52:08,599 epoch 4 - iter 78/138 - loss 0.08387789 - time (sec): 4.36 - samples/sec: 2915.81 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:52:09,357 epoch 4 - iter 91/138 - loss 0.08140241 - time (sec): 5.12 - samples/sec: 2901.32 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:52:10,110 epoch 4 - iter 104/138 - loss 0.08215175 - time (sec): 5.88 - samples/sec: 2918.84 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:52:10,855 epoch 4 - iter 117/138 - loss 0.08122231 - time (sec): 6.62 - samples/sec: 2922.65 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:52:11,598 epoch 4 - iter 130/138 - loss 0.07819524 - time (sec): 7.36 - samples/sec: 2917.79 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:52:12,067 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:12,067 EPOCH 4 done: loss 0.0785 - lr: 0.000020
2023-10-17 08:52:12,710 DEV : loss 0.15197761356830597 - f1-score (micro avg) 0.8523
2023-10-17 08:52:12,715 saving best model
2023-10-17 08:52:13,177 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:13,928 epoch 5 - iter 13/138 - loss 0.10698090 - time (sec): 0.75 - samples/sec: 3091.96 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:52:14,650 epoch 5 - iter 26/138 - loss 0.06924964 - time (sec): 1.47 - samples/sec: 3031.03 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:52:15,371 epoch 5 - iter 39/138 - loss 0.06118268 - time (sec): 2.19 - samples/sec: 2922.28 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:52:16,070 epoch 5 - iter 52/138 - loss 0.06382275 - time (sec):
2.89 - samples/sec: 2879.01 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:52:16,787 epoch 5 - iter 65/138 - loss 0.06943541 - time (sec): 3.60 - samples/sec: 2902.42 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:52:17,536 epoch 5 - iter 78/138 - loss 0.06700827 - time (sec): 4.35 - samples/sec: 2902.83 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:52:18,309 epoch 5 - iter 91/138 - loss 0.06373723 - time (sec): 5.13 - samples/sec: 2886.15 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:52:19,072 epoch 5 - iter 104/138 - loss 0.06032136 - time (sec): 5.89 - samples/sec: 2911.29 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:52:19,800 epoch 5 - iter 117/138 - loss 0.05727143 - time (sec): 6.62 - samples/sec: 2921.75 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:52:20,566 epoch 5 - iter 130/138 - loss 0.05646142 - time (sec): 7.38 - samples/sec: 2916.80 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:52:21,018 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:21,018 EPOCH 5 done: loss 0.0545 - lr: 0.000017
2023-10-17 08:52:21,658 DEV : loss 0.14645631611347198 - f1-score (micro avg) 0.848
2023-10-17 08:52:21,663 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:22,350 epoch 6 - iter 13/138 - loss 0.05440593 - time (sec): 0.69 - samples/sec: 2805.84 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:52:23,071 epoch 6 - iter 26/138 - loss 0.05574596 - time (sec): 1.41 - samples/sec: 2864.50 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:52:23,823 epoch 6 - iter 39/138 - loss 0.05112562 - time (sec): 2.16 - samples/sec: 2854.24 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:52:24,559 epoch 6 - iter 52/138 - loss 0.04013845 - time (sec): 2.90 - samples/sec: 2855.74 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:52:25,292 epoch 6 - iter 65/138 - loss 0.04848526 - time (sec): 3.63 - samples/sec: 2911.92 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:52:26,046 epoch 6 - iter 78/138 - loss 0.04253481 - time (sec): 4.38 - samples/sec: 2903.81 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:52:26,786 epoch 6 - iter 91/138 - loss 0.04241939 - time (sec): 5.12 - samples/sec: 2888.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:52:27,558 epoch 6 - iter 104/138 - loss 0.04347683 - time (sec): 5.89 - samples/sec: 2881.26 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:52:28,385 epoch 6 - iter 117/138 - loss 0.04842467 - time (sec): 6.72 - samples/sec: 2879.03 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:52:29,146 epoch 6 - iter 130/138 - loss 0.04731818 - time (sec): 7.48 - samples/sec: 2889.82 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:52:29,569 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:29,569 EPOCH 6 done: loss 0.0458 - lr: 0.000014
2023-10-17 08:52:30,205 DEV : loss 0.16254711151123047 - f1-score (micro avg) 0.8558
2023-10-17 08:52:30,210 saving best model
2023-10-17 08:52:30,641 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:31,406 epoch 7 - iter 13/138 - loss 0.07892644 - time (sec): 0.76 - samples/sec: 2639.24 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:52:32,195 epoch 7 - iter 26/138 - loss 0.05235230 - time (sec): 1.55 - samples/sec: 2701.65 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:52:32,979 epoch 7 - iter 39/138 - loss 0.06129275 - time (sec): 2.34 - samples/sec: 2785.00 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:52:33,740 epoch 7 - iter 52/138 - loss 0.05321401 - time (sec): 3.10 - samples/sec: 2794.96 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:52:34,503 epoch 7 - iter 65/138 - loss 0.04496211 - time (sec): 3.86 - samples/sec: 2770.10 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:52:35,283 epoch 7 - iter 78/138 - loss 0.04023974 - time (sec): 4.64 - samples/sec:
2780.31 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:52:36,115 epoch 7 - iter 91/138 - loss 0.03796743 - time (sec): 5.47 - samples/sec: 2769.56 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:52:36,890 epoch 7 - iter 104/138 - loss 0.03837807 - time (sec): 6.25 - samples/sec: 2781.17 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:52:37,680 epoch 7 - iter 117/138 - loss 0.03792803 - time (sec): 7.04 - samples/sec: 2751.54 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:52:38,410 epoch 7 - iter 130/138 - loss 0.03935759 - time (sec): 7.77 - samples/sec: 2767.90 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:52:38,868 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:38,868 EPOCH 7 done: loss 0.0379 - lr: 0.000010
2023-10-17 08:52:39,506 DEV : loss 0.17608144879341125 - f1-score (micro avg) 0.8704
2023-10-17 08:52:39,511 saving best model
2023-10-17 08:52:39,941 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:40,779 epoch 8 - iter 13/138 - loss 0.01825711 - time (sec): 0.83 - samples/sec: 2806.12 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:52:41,507 epoch 8 - iter 26/138 - loss 0.03196580 - time (sec): 1.56 - samples/sec: 2904.24 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:52:42,224 epoch 8 - iter 39/138 - loss 0.04011578 - time (sec): 2.28 - samples/sec: 2845.84 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:52:42,996 epoch 8 - iter 52/138 - loss 0.03238949 - time (sec): 3.05 - samples/sec: 2840.52 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:52:43,812 epoch 8 - iter 65/138 - loss 0.02854563 - time (sec): 3.87 - samples/sec: 2869.46 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:52:44,533 epoch 8 - iter 78/138 - loss 0.02841223 - time (sec): 4.59 - samples/sec: 2870.75 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:52:45,307 epoch 8 - iter 91/138 - loss 0.02780310 - time (sec): 5.36 - samples/sec: 2870.88 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:52:46,078 epoch 8 - iter 104/138 - loss 0.03657203 - time (sec): 6.13 - samples/sec: 2871.26 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:52:46,821 epoch 8 - iter 117/138 - loss 0.03573664 - time (sec): 6.88 - samples/sec: 2854.26 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:52:47,589 epoch 8 - iter 130/138 - loss 0.03524076 - time (sec): 7.64 - samples/sec: 2837.65 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:52:48,015 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:48,015 EPOCH 8 done: loss 0.0356 - lr: 0.000007
2023-10-17 08:52:48,651 DEV : loss 0.17216496169567108 - f1-score (micro avg) 0.8785
2023-10-17 08:52:48,656 saving best model
2023-10-17 08:52:49,083 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:49,878 epoch 9 - iter 13/138 - loss 0.03625675 - time (sec): 0.79 - samples/sec: 2875.71 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:52:50,651 epoch 9 - iter 26/138 - loss 0.02996850 - time (sec): 1.56 - samples/sec: 2944.93 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:52:51,378 epoch 9 - iter 39/138 - loss 0.02784090 - time (sec): 2.29 - samples/sec: 2894.56 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:52:52,108 epoch 9 - iter 52/138 - loss 0.02769365 - time (sec): 3.02 - samples/sec: 2885.59 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:52:52,802 epoch 9 - iter 65/138 - loss 0.02536830 - time (sec): 3.72 - samples/sec: 2833.60 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:52:53,534 epoch 9 - iter 78/138 - loss 0.03015985 - time (sec): 4.45 - samples/sec: 2891.31 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:52:54,246 epoch 9 - iter 91/138 - loss 0.03103286 - time (sec): 5.16 - samples/sec: 2900.77 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:52:54,973 epoch 9 - iter 104/138 - loss 0.02804237 - time
(sec): 5.89 - samples/sec: 2889.97 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:52:55,721 epoch 9 - iter 117/138 - loss 0.02708466 - time (sec): 6.63 - samples/sec: 2888.88 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:52:56,449 epoch 9 - iter 130/138 - loss 0.02911539 - time (sec): 7.36 - samples/sec: 2905.42 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:52:56,934 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:56,934 EPOCH 9 done: loss 0.0287 - lr: 0.000004
2023-10-17 08:52:57,566 DEV : loss 0.18820539116859436 - f1-score (micro avg) 0.8774
2023-10-17 08:52:57,571 ----------------------------------------------------------------------------------------------------
2023-10-17 08:52:58,309 epoch 10 - iter 13/138 - loss 0.02028127 - time (sec): 0.74 - samples/sec: 2790.06 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:52:59,079 epoch 10 - iter 26/138 - loss 0.02008708 - time (sec): 1.51 - samples/sec: 2949.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:52:59,792 epoch 10 - iter 39/138 - loss 0.02027624 - time (sec): 2.22 - samples/sec: 2955.32 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:53:00,485 epoch 10 - iter 52/138 - loss 0.02362562 - time (sec): 2.91 - samples/sec: 2863.04 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:53:01,206 epoch 10 - iter 65/138 - loss 0.02127404 - time (sec): 3.63 - samples/sec: 2896.01 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:53:01,956 epoch 10 - iter 78/138 - loss 0.02290557 - time (sec): 4.38 - samples/sec: 2866.13 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:53:02,668 epoch 10 - iter 91/138 - loss 0.02094274 - time (sec): 5.10 - samples/sec: 2895.83 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:53:03,464 epoch 10 - iter 104/138 - loss 0.02606831 - time (sec): 5.89 - samples/sec: 2906.77 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:53:04,263 epoch 10 - iter 117/138 - loss 0.02606882 - time (sec): 6.69 - samples/sec: 2892.23 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:53:05,042 epoch 10 - iter 130/138 - loss 0.02357499 - time (sec): 7.47 - samples/sec: 2883.34 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:53:05,512 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:05,513 EPOCH 10 done: loss 0.0241 - lr: 0.000000
2023-10-17 08:53:06,147 DEV : loss 0.18060584366321564 - f1-score (micro avg) 0.8753
2023-10-17 08:53:06,517 ----------------------------------------------------------------------------------------------------
2023-10-17 08:53:06,518 Loading model from best epoch ...
2023-10-17 08:53:07,866 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:53:08,674 
Results:
- F-score (micro) 0.8924
- F-score (macro) 0.6674
- Accuracy 0.8173

By class:
              precision    recall  f1-score   support

       scope     0.8966    0.8864    0.8914       176
        pers     0.9593    0.9219    0.9402       128
        work     0.8025    0.8784    0.8387        74
      object     0.0000    0.0000    0.0000         2
         loc     1.0000    0.5000    0.6667         2

   micro avg     0.8947    0.8901    0.8924       382
   macro avg     0.7317    0.6373    0.6674       382
weighted avg     0.8952    0.8901    0.8917       382

2023-10-17 08:53:08,674 ----------------------------------------------------------------------------------------------------
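The log's Plugins section reports a LinearScheduler with warmup_fraction '0.1' over a peak learning_rate of 3e-05, and the per-iteration lr values trace the resulting schedule. A minimal sketch of such a warmup-then-linear-decay schedule, assuming 138 batches per epoch for 10 epochs (1380 total steps, as the iteration counts suggest); the function `lr_at` is hypothetical, not part of Flair:

```python
def lr_at(step: int, peak_lr: float = 3e-05, total_steps: int = 1380,
          warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay back to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # 138 steps here
    if step < warmup_steps:
        # warmup phase: lr rises linearly from 0 to peak_lr
        return peak_lr * step / warmup_steps
    # decay phase: lr falls linearly from peak_lr to 0 at the last step
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Rounded to 6 decimals, this reproduces the logged values:
# step 13 (epoch 1, iter 13)  -> 0.000003
# step 151 (epoch 2, iter 13) -> 0.000030
```

This matches the pattern visible in the log: lr climbs through epoch 1, peaks near 3e-05 at the start of epoch 2, and decays to 0.000000 by the end of epoch 10.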
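As a sanity check on the aggregate scores in the final table, the macro and weighted F1 averages can be recomputed from the per-class rows. A standalone sketch; the per-class values are copied verbatim from the evaluation table above:

```python
# Per-class (precision, recall, f1, support) from the final evaluation table.
per_class = {
    "scope":  (0.8966, 0.8864, 0.8914, 176),
    "pers":   (0.9593, 0.9219, 0.9402, 128),
    "work":   (0.8025, 0.8784, 0.8387, 74),
    "object": (0.0000, 0.0000, 0.0000, 2),
    "loc":    (1.0000, 0.5000, 0.6667, 2),
}

# Macro average: unweighted mean of per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by support.
total_support = sum(s for _, _, _, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

print(round(macro_f1, 4))     # 0.6674, matching "F-score (macro)"
print(round(weighted_f1, 4))  # 0.8917, matching the "weighted avg" row
```

Note how the two zero-support-heavy classes (object, loc) drag the macro average (0.6674) far below the micro average (0.8924), which is dominated by the well-populated scope and pers classes.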