# flan-t5-rouge-squad-qg-120c

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.3871
- Rouge1: 0.2935
- Rouge2: 0.2208
- Rougel: 0.2920
- Rougelsum: 0.2959
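
The figures above are the standard rouge1/rouge2/rougeL/rougeLsum F-measures. For reference, a minimal sketch of how such scores are typically computed with the `evaluate` library follows; the prediction and reference strings are hypothetical placeholders, not drawn from this model's evaluation set.

```python
# Minimal ROUGE computation sketch using the `evaluate` library.
# The strings below are hypothetical placeholders, not taken from
# this model's actual evaluation data.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["in which year did the war end"],  # hypothetical model output
    references=["in what year did the war end"],    # hypothetical gold question
)
print(scores)  # rouge1 / rouge2 / rougeL / rougeLsum F-measures
```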
## Model description
More information needed
## Intended uses & limitations
More information needed
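
Although the card does not document usage, the model inherits the standard seq2seq interface from google/flan-t5-base, so it can be loaded as sketched below. The input format (a context passage in, a question out) is an assumption based on the `squad-qg` suffix in the model name, not something the card confirms.

```python
# Minimal loading/generation sketch. The context-in, question-out usage
# is an assumption inferred from the model name, not documented here.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "devagonal/flan-t5-rouge-squad-qg-120c"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

context = "The Normans were the people who in the 10th and 11th centuries gave their name to Normandy."
inputs = tokenizer(context, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```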
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a reproduction sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 120
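
A sketch of `Seq2SeqTrainingArguments` mirroring the list above; the `output_dir` is a placeholder, and the `Trainer`/dataset wiring is omitted since the card does not document it.

```python
# Hyperparameter reproduction sketch; output_dir is a placeholder and
# the Trainer/dataset wiring is omitted (not documented in this card).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-squad-qg-120c",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    optim="adamw_torch",  # AdamW with betas=(0.9, 0.999), eps=1e-8 (torch defaults)
    lr_scheduler_type="linear",
    num_train_epochs=120,
)
```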
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
34.4713 | 1.0 | 3 | 45.1762 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
36.6825 | 2.0 | 6 | 43.7385 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
35.5249 | 3.0 | 9 | 42.2799 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
32.6024 | 4.0 | 12 | 40.8132 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
30.4243 | 5.0 | 15 | 39.3652 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
31.7429 | 6.0 | 18 | 37.9705 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
30.2926 | 7.0 | 21 | 36.6353 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
28.2353 | 8.0 | 24 | 35.3305 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
28.61 | 9.0 | 27 | 34.0814 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
27.3927 | 10.0 | 30 | 32.8620 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
26.0067 | 11.0 | 33 | 31.7321 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
26.0819 | 12.0 | 36 | 30.6746 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
24.9487 | 13.0 | 39 | 29.6870 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
21.7549 | 14.0 | 42 | 28.7613 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
21.9198 | 15.0 | 45 | 27.9048 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
19.8979 | 16.0 | 48 | 27.0596 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
18.9591 | 17.0 | 51 | 26.2339 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
21.4812 | 18.0 | 54 | 25.3991 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
19.597 | 19.0 | 57 | 24.5224 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
18.9566 | 20.0 | 60 | 23.6202 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
18.3614 | 21.0 | 63 | 22.6460 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
18.2406 | 22.0 | 66 | 21.6603 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
17.5012 | 23.0 | 69 | 20.6620 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
16.3583 | 24.0 | 72 | 19.6178 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
15.3727 | 25.0 | 75 | 18.4996 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
15.8902 | 26.0 | 78 | 17.3469 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
14.0877 | 27.0 | 81 | 16.1392 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
12.5516 | 28.0 | 84 | 14.8773 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
11.8765 | 29.0 | 87 | 13.5345 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
10.3602 | 30.0 | 90 | 12.1325 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
10.3562 | 31.0 | 93 | 10.7131 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
8.8225 | 32.0 | 96 | 9.3161 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
8.2495 | 33.0 | 99 | 7.8755 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
8.217 | 34.0 | 102 | 6.6176 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
7.8925 | 35.0 | 105 | 5.7100 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
6.2977 | 36.0 | 108 | 5.1969 | 0.2046 | 0.1174 | 0.2029 | 0.2057 |
5.9653 | 37.0 | 111 | 4.9328 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
5.1543 | 38.0 | 114 | 4.7818 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
5.4242 | 39.0 | 117 | 4.6803 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
5.2743 | 40.0 | 120 | 4.6037 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.797 | 41.0 | 123 | 4.5463 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.67 | 42.0 | 126 | 4.4981 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.5191 | 43.0 | 129 | 4.4519 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.7642 | 44.0 | 132 | 4.4046 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.5347 | 45.0 | 135 | 4.3578 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.5199 | 46.0 | 138 | 4.3133 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.5551 | 47.0 | 141 | 4.2708 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.5618 | 48.0 | 144 | 4.2303 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.5035 | 49.0 | 147 | 4.1928 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.3732 | 50.0 | 150 | 4.1569 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.475 | 51.0 | 153 | 4.1229 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.4034 | 52.0 | 156 | 4.0880 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.6115 | 53.0 | 159 | 4.0555 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.2614 | 54.0 | 162 | 4.0263 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.2738 | 55.0 | 165 | 3.9959 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.9932 | 56.0 | 168 | 3.9629 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.3816 | 57.0 | 171 | 3.9280 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.279 | 58.0 | 174 | 3.8918 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.1165 | 59.0 | 177 | 3.8563 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.0849 | 60.0 | 180 | 3.8186 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.9023 | 61.0 | 183 | 3.7800 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.1332 | 62.0 | 186 | 3.7370 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.0331 | 63.0 | 189 | 3.6916 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.2037 | 64.0 | 192 | 3.6434 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.802 | 65.0 | 195 | 3.5920 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.0885 | 66.0 | 198 | 3.5371 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.1013 | 67.0 | 201 | 3.4793 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
4.1904 | 68.0 | 204 | 3.4197 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.8735 | 69.0 | 207 | 3.3592 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.8527 | 70.0 | 210 | 3.2945 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.7358 | 71.0 | 213 | 3.2286 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.7301 | 72.0 | 216 | 3.1610 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.9434 | 73.0 | 219 | 3.0955 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.5842 | 74.0 | 222 | 3.0311 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.498 | 75.0 | 225 | 2.9672 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.6927 | 76.0 | 228 | 2.9071 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.7322 | 77.0 | 231 | 2.8534 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.6775 | 78.0 | 234 | 2.8056 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.5348 | 79.0 | 237 | 2.7621 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.5138 | 80.0 | 240 | 2.7239 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.6286 | 81.0 | 243 | 2.6901 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.3644 | 82.0 | 246 | 2.6593 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.5279 | 83.0 | 249 | 2.6309 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.2994 | 84.0 | 252 | 2.6070 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.604 | 85.0 | 255 | 2.5905 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.5595 | 86.0 | 258 | 2.5804 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.3599 | 87.0 | 261 | 2.5780 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.755 | 88.0 | 264 | 2.5787 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.5091 | 89.0 | 267 | 2.5804 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.4362 | 90.0 | 270 | 2.5819 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.3453 | 91.0 | 273 | 2.5781 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.2409 | 92.0 | 276 | 2.5692 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.3512 | 93.0 | 279 | 2.5602 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.2764 | 94.0 | 282 | 2.5499 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.1415 | 95.0 | 285 | 2.5409 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.2531 | 96.0 | 288 | 2.5323 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.0301 | 97.0 | 291 | 2.5194 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.3918 | 98.0 | 294 | 2.5060 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.4306 | 99.0 | 297 | 2.4950 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.6159 | 100.0 | 300 | 2.4899 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.2354 | 101.0 | 303 | 2.4884 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.2495 | 102.0 | 306 | 2.4827 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.1807 | 103.0 | 309 | 2.4730 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.2411 | 104.0 | 312 | 2.4653 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.1972 | 105.0 | 315 | 2.4592 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.2739 | 106.0 | 318 | 2.4534 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.0481 | 107.0 | 321 | 2.4477 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.1791 | 108.0 | 324 | 2.4412 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.0665 | 109.0 | 327 | 2.4329 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.4172 | 110.0 | 330 | 2.4248 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.3206 | 111.0 | 333 | 2.4179 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.2582 | 112.0 | 336 | 2.4125 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
2.9083 | 113.0 | 339 | 2.4077 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
2.7824 | 114.0 | 342 | 2.4041 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.2219 | 115.0 | 345 | 2.4000 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.0009 | 116.0 | 348 | 2.3956 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.1795 | 117.0 | 351 | 2.3917 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.0953 | 118.0 | 354 | 2.3891 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.3029 | 119.0 | 357 | 2.3876 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
3.3398 | 120.0 | 360 | 2.3871 | 0.2935 | 0.2208 | 0.2920 | 0.2959 |
### Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0