---
library_name: transformers
license: apache-2.0
base_model: google/mt5-small
tags:
- summarization
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mt5-small-synthetic-data-plus-translated-bs32
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# mt5-small-synthetic-data-plus-translated-bs32

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unspecified dataset (the Trainer did not record a dataset name).
It achieves the following results on the evaluation set at the final epoch (epoch 40, step 1520):
- Loss: 0.8369
- Rouge1: 0.6206
- Rouge2: 0.4859
- Rougel: 0.5972
- Rougelsum: 0.5979
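
For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is a simplified illustration only. It assumes lowercase whitespace tokenization and the F-measure variant; this card does not state which ROUGE implementation or settings produced the numbers above, and `rouge1_f` is a hypothetical helper, not the evaluation code used for this model.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure (assumption: lowercase whitespace tokens)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge1_f("the cat sat on the mat", "the cat lay on the mat")
    print(f"ROUGE-1 F: {score:.4f}")
```

Library implementations usually add stemming and language-aware tokenization, so their scores will generally differ from this sketch.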

## Model description

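An mT5-small checkpoint fine-tuned for summarization, according to the tags in this card's metadata. More information needed.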

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 40

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 19.4785       | 1.0   | 38   | 11.5404         | 0.0055 | 0.0008 | 0.0051 | 0.0051    |
| 11.9977       | 2.0   | 76   | 6.4079          | 0.0101 | 0.0015 | 0.0089 | 0.0094    |
| 7.5027        | 3.0   | 114  | 3.0626          | 0.0542 | 0.0093 | 0.0482 | 0.0487    |
| 4.8939        | 4.0   | 152  | 2.2496          | 0.0492 | 0.0182 | 0.0429 | 0.0437    |
| 3.64          | 5.0   | 190  | 1.7984          | 0.1870 | 0.0826 | 0.1598 | 0.1601    |
| 2.8662        | 6.0   | 228  | 1.4518          | 0.1852 | 0.0916 | 0.1653 | 0.1659    |
| 2.4493        | 7.0   | 266  | 1.3124          | 0.4183 | 0.2586 | 0.4014 | 0.4026    |
| 2.1362        | 8.0   | 304  | 1.2444          | 0.4386 | 0.2716 | 0.4176 | 0.4196    |
| 1.9923        | 9.0   | 342  | 1.1876          | 0.4587 | 0.3034 | 0.4387 | 0.4404    |
| 1.8438        | 10.0  | 380  | 1.1486          | 0.5198 | 0.3637 | 0.4979 | 0.4988    |
| 1.7212        | 11.0  | 418  | 1.1031          | 0.5402 | 0.3848 | 0.5160 | 0.5169    |
| 1.6315        | 12.0  | 456  | 1.0707          | 0.5556 | 0.3999 | 0.5325 | 0.5341    |
| 1.5623        | 13.0  | 494  | 1.0437          | 0.5808 | 0.4309 | 0.5583 | 0.5593    |
| 1.5269        | 14.0  | 532  | 1.0188          | 0.5986 | 0.4540 | 0.5773 | 0.5772    |
| 1.4668        | 15.0  | 570  | 0.9982          | 0.5922 | 0.4511 | 0.5731 | 0.5737    |
| 1.4357        | 16.0  | 608  | 0.9777          | 0.5965 | 0.4549 | 0.5768 | 0.5773    |
| 1.3684        | 17.0  | 646  | 0.9623          | 0.6123 | 0.4722 | 0.5901 | 0.5907    |
| 1.3675        | 18.0  | 684  | 0.9461          | 0.6135 | 0.4771 | 0.5915 | 0.5919    |
| 1.3285        | 19.0  | 722  | 0.9324          | 0.6150 | 0.4754 | 0.5916 | 0.5918    |
| 1.288         | 20.0  | 760  | 0.9271          | 0.6179 | 0.4803 | 0.5964 | 0.5968    |
| 1.2529        | 21.0  | 798  | 0.9129          | 0.6156 | 0.4789 | 0.5939 | 0.5940    |
| 1.2216        | 22.0  | 836  | 0.9017          | 0.6163 | 0.4817 | 0.5941 | 0.5941    |
| 1.2322        | 23.0  | 874  | 0.8948          | 0.6208 | 0.4839 | 0.5985 | 0.5986    |
| 1.2062        | 24.0  | 912  | 0.8838          | 0.6139 | 0.4778 | 0.5904 | 0.5912    |
| 1.1642        | 25.0  | 950  | 0.8761          | 0.6150 | 0.4818 | 0.5939 | 0.5951    |
| 1.1699        | 26.0  | 988  | 0.8759          | 0.6152 | 0.4794 | 0.5929 | 0.5932    |
| 1.1428        | 27.0  | 1026 | 0.8662          | 0.6158 | 0.4806 | 0.5935 | 0.5946    |
| 1.195         | 28.0  | 1064 | 0.8609          | 0.6126 | 0.4758 | 0.5898 | 0.5908    |
| 1.1619        | 29.0  | 1102 | 0.8568          | 0.6152 | 0.4776 | 0.5924 | 0.5936    |
| 1.1172        | 30.0  | 1140 | 0.8548          | 0.6181 | 0.4788 | 0.5951 | 0.5964    |
| 1.1141        | 31.0  | 1178 | 0.8526          | 0.6148 | 0.4766 | 0.5904 | 0.5914    |
| 1.1176        | 32.0  | 1216 | 0.8488          | 0.6201 | 0.4834 | 0.5963 | 0.5972    |
| 1.0959        | 33.0  | 1254 | 0.8475          | 0.6225 | 0.4847 | 0.5983 | 0.5993    |
| 1.0954        | 34.0  | 1292 | 0.8437          | 0.6220 | 0.4859 | 0.5987 | 0.5986    |
| 1.0844        | 35.0  | 1330 | 0.8420          | 0.6206 | 0.4851 | 0.5969 | 0.5974    |
| 1.1041        | 36.0  | 1368 | 0.8398          | 0.6222 | 0.4865 | 0.5991 | 0.5992    |
| 1.0736        | 37.0  | 1406 | 0.8386          | 0.6225 | 0.4867 | 0.5991 | 0.6001    |
| 1.0816        | 38.0  | 1444 | 0.8376          | 0.6229 | 0.4871 | 0.5994 | 0.6001    |
| 1.0537        | 39.0  | 1482 | 0.8372          | 0.6242 | 0.4876 | 0.6004 | 0.6013    |
| 1.092         | 40.0  | 1520 | 0.8369          | 0.6206 | 0.4859 | 0.5972 | 0.5979    |
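
Two things can be read off the table with simple arithmetic, sketched below. First, 38 steps per epoch at batch size 32 bounds the training set size, assuming a single device, no gradient accumulation, and a final partial batch that is kept (none of which the card confirms). Second, the best epoch depends on the selection metric. The rows are copied from the last five epochs of the table; every earlier epoch has a lower Rouge1 and a higher validation loss, so they are omitted for brevity.

```python
BATCH_SIZE = 32
STEPS_PER_EPOCH = 38

# ceil(n / 32) == 38 implies 37 * 32 < n <= 38 * 32 (under the stated assumptions).
min_examples = (STEPS_PER_EPOCH - 1) * BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * BATCH_SIZE

# Last five rows of the training results table (epoch, validation loss, Rouge1).
rows = [
    {"epoch": 36, "val_loss": 0.8398, "rouge1": 0.6222},
    {"epoch": 37, "val_loss": 0.8386, "rouge1": 0.6225},
    {"epoch": 38, "val_loss": 0.8376, "rouge1": 0.6229},
    {"epoch": 39, "val_loss": 0.8372, "rouge1": 0.6242},
    {"epoch": 40, "val_loss": 0.8369, "rouge1": 0.6206},
]

best_by_rouge1 = max(rows, key=lambda r: r["rouge1"])
best_by_loss = min(rows, key=lambda r: r["val_loss"])

if __name__ == "__main__":
    print(f"training examples: between {min_examples} and {max_examples}")
    print(f"best Rouge1: epoch {best_by_rouge1['epoch']}")
    print(f"lowest validation loss: epoch {best_by_loss['epoch']}")
```

Read this way, epoch 39 has the highest Rouge1 (0.6242) while epoch 40 has the lowest validation loss (0.8369); the results reported at the top of this card are those of epoch 40, and the differences between the two are small.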


### Framework versions

- Transformers 4.47.1
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0