Strange18 committed on
Commit 1dd7ece · verified · 1 Parent(s): 9bf0727

trained on new dataset for 2 epochs on 2.8k dataset

Files changed (4)
  1. README.md +18 -19
  2. model.safetensors +1 -1
  3. tokenizer.json +2 -2
  4. training_args.bin +1 -1
README.md CHANGED
@@ -19,14 +19,14 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [Strange18/results](https://huggingface.co/Strange18/results) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.1168
- - Sacrebleu: 26.3209
- - Bleu: 0.2632
- - Rouge1: 0.5660
- - Rouge2: 0.3505
- - Rougel: 0.5283
- - Rougelsum: 0.5264
- - Ter: 60.5362
+ - Loss: 0.1108
+ - Sacrebleu: 28.4098
+ - Bleu: 0.2841
+ - Rouge1: 0.6157
+ - Rouge2: 0.3844
+ - Rougel: 0.5828
+ - Rougelsum: 0.5826
+ - Ter: 53.4048

  ## Model description

@@ -46,25 +46,24 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 1e-05
- - train_batch_size: 4
- - eval_batch_size: 4
+ - train_batch_size: 2
+ - eval_batch_size: 2
  - seed: 42
- - gradient_accumulation_steps: 2
+ - gradient_accumulation_steps: 4
  - total_train_batch_size: 8
  - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 50
- - num_epochs: 4
+ - lr_scheduler_warmup_steps: 700
+ - num_epochs: 3
  - mixed_precision_training: Native AMP

  ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Sacrebleu | Bleu | Rouge1 | Rouge2 | Rougel | Rougelsum | Ter |
- |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:------:|:------:|:---------:|:-------:|
- | 0.1234 | 1.0 | 336 | 0.1226 | 25.7182 | 0.2572 | 0.5644 | 0.3361 | 0.5288 | 0.5277 | 59.4638 |
- | 0.0804 | 2.0 | 672 | 0.1152 | 24.4963 | 0.2450 | 0.5502 | 0.3245 | 0.5151 | 0.5134 | 61.6086 |
- | 0.0668 | 3.0 | 1008 | 0.1141 | 25.1601 | 0.2516 | 0.5680 | 0.3433 | 0.5346 | 0.5322 | 59.8391 |
- | 0.0547 | 4.0 | 1344 | 0.1168 | 26.3209 | 0.2632 | 0.5660 | 0.3505 | 0.5283 | 0.5264 | 60.5362 |
+ | Training Loss | Epoch | Step | Validation Loss | Sacrebleu | Bleu | Rouge1 | Rouge2 | Rougel | Rougelsum | Ter |
+ |:-------------:|:------:|:----:|:---------------:|:---------:|:------:|:------:|:------:|:------:|:---------:|:-------:|
+ | 0.2067 | 0.9978 | 335 | 0.1662 | 16.3668 | 0.1637 | 0.4966 | 0.2466 | 0.4518 | 0.4506 | 68.4718 |
+ | 0.1603 | 1.9993 | 671 | 0.1570 | 19.9178 | 0.1992 | 0.5186 | 0.2854 | 0.4882 | 0.4874 | 63.0027 |
+ | 0.1169 | 2.9948 | 1005 | 0.1108 | 28.4098 | 0.2841 | 0.6157 | 0.3844 | 0.5828 | 0.5826 | 53.4048 |


  ### Framework versions
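
Note on the hyperparameter change: the per-device batch size drops from 4 to 2 while gradient accumulation rises from 2 to 4, so the total train batch size stays at 8. The warmup grows from 50 to 700 steps, which is most of the 1005 steps logged in the new run. The sketch below illustrates both points. It assumes a single device, takes the last logged step (1005) as the total number of optimizer steps, and uses the usual "linear warmup, then linear decay to zero" formula for a linear scheduler. The function names are hypothetical helpers, not part of this repository.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate under linear warmup followed by linear decay to zero.

    This mirrors the common linear-with-warmup schedule; it is an
    illustration, not code taken from the training script.
    """
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * (step / max(1, warmup_steps))
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # Old and new settings give the same total train batch size.
    print("old:", effective_batch_size(4, 2))
    print("new:", effective_batch_size(2, 4))

    # Learning rate at the three logged evaluation steps and at the warmup peak.
    # total_steps=1005 is assumed from the last row of the results table.
    for step in (335, 671, 700, 1005):
        print(step, linear_lr(step, base_lr=1e-05, warmup_steps=700, total_steps=1005))
```

Under these assumptions the first two evaluations (steps 335 and 671) happen while the learning rate is still ramping up, and the peak rate of 1e-05 is reached only at step 700, in the third epoch.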
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9dc59fe5866921381b36763657bd393114ff766793410f5a57b05c0f495f917b
+ oid sha256:e2e8ef8afc2c8e8c14a003a30b03d2e9f35aef109b2cafb7c4ad8cb2ac055ba3
  size 2444578688
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:942a8de2c4c936e740ea50e074296519b5ceccf29c86b4eaf5f4fc93377e3b09
- size 17094835
+ oid sha256:3ac4bfeac2fcd7cbc788d5a8d708aea33f37f05b4898b0c23651da928afcfa72
+ size 17094570
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8847ba8b25dd5c6877e518755f47317024a78293ddc31f600b03375d757a4350
+ oid sha256:50a434fc6367b5dfee6cea33ddd0ee0000ddf4739bf62a83674e602713332960
  size 5432
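
Note on the three binary files: the diffs above are Git LFS pointer files, not the binaries themselves. Each pointer carries a spec version, an `oid` (hash algorithm and digest of the real file), and a `size` in bytes. A changed `oid` with an unchanged `size` (as for model.safetensors and training_args.bin) means the contents changed but the byte length did not. The sketch below parses such a pointer. `parse_lfs_pointer` is a hypothetical helper written for illustration and handles only the three keys that appear in this commit.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a small dict.

    Hypothetical helper: it expects 'version', 'oid', and 'size' lines,
    each of the form '<key> <value>', as seen in this commit.
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid value has the form '<algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algorithm": algorithm,
        "oid": digest,
        "size": int(fields["size"]),
    }


if __name__ == "__main__":
    # Old and new pointers for tokenizer.json, copied from the diff above.
    old = parse_lfs_pointer(
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:942a8de2c4c936e740ea50e074296519b5ceccf29c86b4eaf5f4fc93377e3b09\n"
        "size 17094835\n"
    )
    new = parse_lfs_pointer(
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:3ac4bfeac2fcd7cbc788d5a8d708aea33f37f05b4898b0c23651da928afcfa72\n"
        "size 17094570\n"
    )
    print("content changed:", old["oid"] != new["oid"])
    print("size delta (bytes):", new["size"] - old["size"])
```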