Trained for 2 epochs on a new 2.8k dataset

Files changed (4)

README.md +18 -19
model.safetensors +1 -1
tokenizer.json +2 -2
training_args.bin +1 -1

README.md CHANGED

@@ -19,14 +19,14 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [Strange18/results](https://huggingface.co/Strange18/results) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.1168
-- Sacrebleu: 26.3209
-- Bleu: 0.2632
-- Rouge1: 0.5660
-- Rouge2: 0.3505
-- Rougel: 0.5283
-- Rougelsum: 0.5264
-- Ter: 60.5362
 ## Model description
@@ -46,25 +46,24 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 1e-05
-- train_batch_size: 4
-- eval_batch_size: 4
 - seed: 42
-- gradient_accumulation_steps: 2
 - total_train_batch_size: 8
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 50
-- num_epochs: 4
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Sacrebleu | Bleu   | Rouge1 | Rouge2 | Rougel | Rougelsum | Ter     |
-|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:------:|:------:|:---------:|:-------:|
-| 0.1234        | 1.0   | 336  | 0.1226          | 25.7182   | 0.2572 | 0.5644 | 0.3361 | 0.5288 | 0.5277    | 59.4638 |
-| 0.0804        | 2.0   | 672  | 0.1152          | 24.4963   | 0.2450 | 0.5502 | 0.3245 | 0.5151 | 0.5134    | 61.6086 |
-| 0.0668        | 3.0   | 1008 | 0.1141          | 25.1601   | 0.2516 | 0.5680 | 0.3433 | 0.5346 | 0.5322    | 59.8391 |
-| 0.0547        | 4.0   | 1344 | 0.1168          | 26.3209   | 0.2632 | 0.5660 | 0.3505 | 0.5283 | 0.5264    | 60.5362 |
 ### Framework versions

 This model is a fine-tuned version of [Strange18/results](https://huggingface.co/Strange18/results) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1108
+- Sacrebleu: 28.4098
+- Bleu: 0.2841
+- Rouge1: 0.6157
+- Rouge2: 0.3844
+- Rougel: 0.5828
+- Rougelsum: 0.5826
+- Ter: 53.4048
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 1e-05
+- train_batch_size: 2
+- eval_batch_size: 2
 - seed: 42
+- gradient_accumulation_steps: 4
 - total_train_batch_size: 8
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 700
+- num_epochs: 3
 - mixed_precision_training: Native AMP
 ### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Sacrebleu | Bleu   | Rouge1 | Rouge2 | Rougel | Rougelsum | Ter     |
+|:-------------:|:------:|:----:|:---------------:|:---------:|:------:|:------:|:------:|:------:|:---------:|:-------:|
+| 0.2067        | 0.9978 | 335  | 0.1662          | 16.3668   | 0.1637 | 0.4966 | 0.2466 | 0.4518 | 0.4506    | 68.4718 |
+| 0.1603        | 1.9993 | 671  | 0.1570          | 19.9178   | 0.1992 | 0.5186 | 0.2854 | 0.4882 | 0.4874    | 63.0027 |
+| 0.1169        | 2.9948 | 1005 | 0.1108          | 28.4098   | 0.2841 | 0.6157 | 0.3844 | 0.5828 | 0.5826    | 53.4048 |
 ### Framework versions

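The hyperparameter change in this README diff can be checked with a little arithmetic. The sketch below is not taken from the training script. It assumes a single device, assumes that `total_train_batch_size` is the per-device batch size times `gradient_accumulation_steps`, and assumes the `linear` scheduler ramps the learning rate up over the warmup steps and then decays it to zero at the last step. The helper names and the total of 1005 optimizer steps (the last step logged in the new results table) are illustrative assumptions.

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


def linear_schedule_lr(step, base_lr=1e-05, warmup_steps=700, total_steps=1005):
    """Linear warmup followed by linear decay to zero.

    total_steps=1005 is assumed from the last logged step, not read from
    training_args.bin.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Old side of the diff: train_batch_size 4, gradient_accumulation_steps 2.
old_total = effective_batch_size(4, 2)
# New side of the diff: train_batch_size 2, gradient_accumulation_steps 4.
new_total = effective_batch_size(2, 4)
print("total_train_batch_size old/new:", old_total, new_total)

# Learning rate at the three logged evaluation steps of the new run.
for logged_step in (335, 671, 1005):
    print(logged_step, linear_schedule_lr(logged_step))
```

Under these assumptions both configurations give the same total batch size, which is consistent with `total_train_batch_size: 8` appearing as an unchanged line. With 700 warmup steps, the first two evaluations happen while the learning rate is still ramping up.
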
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9dc59fe5866921381b36763657bd393114ff766793410f5a57b05c0f495f917b
 size 2444578688

 version https://git-lfs.github.com/spec/v1
+oid sha256:e2e8ef8afc2c8e8c14a003a30b03d2e9f35aef109b2cafb7c4ad8cb2ac055ba3
 size 2444578688

tokenizer.json CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:942a8de2c4c936e740ea50e074296519b5ceccf29c86b4eaf5f4fc93377e3b09
-size 17094835

 version https://git-lfs.github.com/spec/v1
+oid sha256:3ac4bfeac2fcd7cbc788d5a8d708aea33f37f05b4898b0c23651da928afcfa72
+size 17094570

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8847ba8b25dd5c6877e518755f47317024a78293ddc31f600b03375d757a4350
 size 5432

 version https://git-lfs.github.com/spec/v1
+oid sha256:50a434fc6367b5dfee6cea33ddd0ee0000ddf4739bf62a83674e602713332960
 size 5432
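
The three binary files above are stored as Git LFS pointers, so each diff only shows a changed `oid sha256:` digest and, for `tokenizer.json`, a changed `size`. As a minimal sketch, a downloaded file could be checked against its pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path in the usage comment is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file's size and digest match the pointer."""
    file = Path(path)
    if file.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file.open("rb") as handle:
        # Read in 1 MiB chunks so multi-gigabyte weights are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer for the new training_args.bin, copied from the diff above.
new_training_args = parse_lfs_pointer(
    """
    version https://git-lfs.github.com/spec/v1
    oid sha256:50a434fc6367b5dfee6cea33ddd0ee0000ddf4739bf62a83674e602713332960
    size 5432
    """
)
print(new_training_args["algo"], new_training_args["size"])

# Usage, assuming the file was downloaded to the working directory:
# matches_pointer("training_args.bin", new_training_args)
```

Comparing the size first is a cheap way to reject a mismatched file before hashing it, which matters most for the 2444578688-byte `model.safetensors`.
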