# flan-t5-rouge-squad-qg-testb
This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.4778
- Rouge1: 0.3588
- Rouge2: 0.1173
- RougeL: 0.3192
- RougeLsum: 0.3430
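
This checkpoint loads like any other flan-t5-small fine-tune. Below is a minimal inference sketch with 🤗 Transformers; because the training data is not documented, the prompt format (a SQuAD-style context-plus-answer input, guessed from the "squad-qg" name) is an assumption rather than a confirmed specification.

```python
# Minimal inference sketch. Assumption: the prompt format below is guessed
# from the "squad-qg" model name; adjust it to match the actual training data.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "devagonal/flan-t5-rouge-squad-qg-testb"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

context = "The Eiffel Tower was completed in 1889 and is located in Paris."
answer = "1889"
prompt = f"generate question: context: {context} answer: {answer}"

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```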
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 80
- eval_batch_size: 80
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 320
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 160
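
The original training script is not part of this card; the sketch below is a reconstruction that maps the hyperparameters above onto `Seq2SeqTrainingArguments`. The `output_dir` and `predict_with_generate` values are assumptions, not reported settings.

```python
# Reconstruction sketch of the listed hyperparameters (not the original script).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-squad-qg-testb",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=80,
    per_device_eval_batch_size=80,
    gradient_accumulation_steps=4,   # effective train batch size: 80 * 4 = 320
    seed=42,
    optim="adamw_torch",             # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=160,
    predict_with_generate=True,      # assumed; needed to compute ROUGE at eval time
)
```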
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
---|---|---|---|---|---|---|---|
66.9315 | 1.0 | 3 | 33.8039 | 0.0551 | 0.0150 | 0.0500 | 0.0511 |
59.6739 | 2.0 | 6 | 29.7314 | 0.0864 | 0.0288 | 0.0776 | 0.0784 |
54.5656 | 3.0 | 9 | 26.4939 | 0.0875 | 0.0310 | 0.0801 | 0.0805 |
48.9503 | 4.0 | 12 | 23.5018 | 0.0705 | 0.0230 | 0.0658 | 0.0661 |
44.6287 | 5.0 | 15 | 20.5282 | 0.0584 | 0.0179 | 0.0536 | 0.0541 |
39.5637 | 6.0 | 18 | 17.2638 | 0.0729 | 0.0340 | 0.0726 | 0.0724 |
35.0541 | 7.0 | 21 | 13.6554 | 0.0856 | 0.0389 | 0.0853 | 0.0853 |
29.799 | 8.0 | 24 | 10.2408 | 0.0821 | 0.0411 | 0.0816 | 0.0815 |
24.9435 | 9.0 | 27 | 8.2909 | 0.0705 | 0.0387 | 0.0707 | 0.0703 |
21.8868 | 10.0 | 30 | 7.8624 | 0.0764 | 0.0412 | 0.0765 | 0.0765 |
19.6007 | 11.0 | 33 | 7.6396 | 0.0789 | 0.0421 | 0.0791 | 0.0788 |
17.6708 | 12.0 | 36 | 7.3917 | 0.0786 | 0.0402 | 0.0757 | 0.0768 |
16.0957 | 13.0 | 39 | 7.0298 | 0.0874 | 0.0411 | 0.0845 | 0.0858 |
14.8993 | 14.0 | 42 | 6.3918 | 0.0933 | 0.0416 | 0.0885 | 0.0905 |
13.8388 | 15.0 | 45 | 5.6045 | 0.0940 | 0.0442 | 0.0889 | 0.0910 |
12.7787 | 16.0 | 48 | 5.0419 | 0.0962 | 0.0445 | 0.0899 | 0.0927 |
11.9806 | 17.0 | 51 | 4.7749 | 0.0887 | 0.0399 | 0.0822 | 0.0850 |
11.2918 | 18.0 | 54 | 4.6137 | 0.0965 | 0.0459 | 0.0910 | 0.0943 |
10.7497 | 19.0 | 57 | 4.4894 | 0.0952 | 0.0427 | 0.0875 | 0.0913 |
10.3252 | 20.0 | 60 | 4.3848 | 0.1051 | 0.0553 | 0.0968 | 0.1011 |
9.8977 | 21.0 | 63 | 4.2909 | 0.1068 | 0.0520 | 0.0945 | 0.1002 |
9.6954 | 22.0 | 66 | 4.2021 | 0.0953 | 0.0412 | 0.0822 | 0.0884 |
9.4192 | 23.0 | 69 | 4.1174 | 0.1043 | 0.0404 | 0.0887 | 0.0947 |
9.147 | 24.0 | 72 | 4.0337 | 0.1089 | 0.0466 | 0.0961 | 0.1016 |
8.8825 | 25.0 | 75 | 3.9481 | 0.1026 | 0.0436 | 0.0911 | 0.0958 |
8.6989 | 26.0 | 78 | 3.8586 | 0.1082 | 0.0477 | 0.0955 | 0.1004 |
8.4876 | 27.0 | 81 | 3.7634 | 0.1088 | 0.0479 | 0.0960 | 0.1008 |
8.3068 | 28.0 | 84 | 3.6638 | 0.1119 | 0.0432 | 0.0943 | 0.1017 |
8.1205 | 29.0 | 87 | 3.5645 | 0.1067 | 0.0409 | 0.0900 | 0.0973 |
7.985 | 30.0 | 90 | 3.4753 | 0.1205 | 0.0425 | 0.0991 | 0.1075 |
7.8097 | 31.0 | 93 | 3.4016 | 0.1024 | 0.0422 | 0.0855 | 0.0925 |
7.6231 | 32.0 | 96 | 3.3417 | 0.1108 | 0.0415 | 0.0910 | 0.0985 |
7.5175 | 33.0 | 99 | 3.2915 | 0.0932 | 0.0333 | 0.0757 | 0.0833 |
7.3729 | 34.0 | 102 | 3.2466 | 0.0965 | 0.0335 | 0.0779 | 0.0864 |
7.1983 | 35.0 | 105 | 3.2033 | 0.0967 | 0.0333 | 0.0781 | 0.0859 |
7.0976 | 36.0 | 108 | 3.1594 | 0.0861 | 0.0293 | 0.0711 | 0.0760 |
6.9906 | 37.0 | 111 | 3.1135 | 0.0861 | 0.0293 | 0.0711 | 0.0760 |
6.8668 | 38.0 | 114 | 3.0645 | 0.0910 | 0.0304 | 0.0762 | 0.0812 |
6.7592 | 39.0 | 117 | 3.0116 | 0.0985 | 0.0314 | 0.0823 | 0.0873 |
6.6407 | 40.0 | 120 | 2.9563 | 0.1002 | 0.0304 | 0.0837 | 0.0887 |
6.5709 | 41.0 | 123 | 2.9008 | 0.0981 | 0.0295 | 0.0831 | 0.0871 |
6.4844 | 42.0 | 126 | 2.8449 | 0.1117 | 0.0370 | 0.0939 | 0.0979 |
6.3734 | 43.0 | 129 | 2.7906 | 0.1034 | 0.0351 | 0.0864 | 0.0907 |
6.2984 | 44.0 | 132 | 2.7383 | 0.0929 | 0.0322 | 0.0795 | 0.0826 |
6.203 | 45.0 | 135 | 2.6884 | 0.1004 | 0.0350 | 0.0840 | 0.0892 |
6.1458 | 46.0 | 138 | 2.6409 | 0.1054 | 0.0354 | 0.0885 | 0.0935 |
6.0371 | 47.0 | 141 | 2.5963 | 0.0916 | 0.0315 | 0.0783 | 0.0831 |
5.9271 | 48.0 | 144 | 2.5539 | 0.0775 | 0.0312 | 0.0657 | 0.0709 |
5.9151 | 49.0 | 147 | 2.5119 | 0.0842 | 0.0323 | 0.0717 | 0.0772 |
5.7851 | 50.0 | 150 | 2.4719 | 0.0893 | 0.0312 | 0.0726 | 0.0786 |
5.6981 | 51.0 | 153 | 2.4329 | 0.0789 | 0.0245 | 0.0634 | 0.0694 |
5.6468 | 52.0 | 156 | 2.3949 | 0.0868 | 0.0273 | 0.0677 | 0.0751 |
5.5874 | 53.0 | 159 | 2.3584 | 0.0868 | 0.0273 | 0.0677 | 0.0751 |
5.5238 | 54.0 | 162 | 2.3235 | 0.1064 | 0.0309 | 0.0853 | 0.0941 |
5.4892 | 55.0 | 165 | 2.2884 | 0.1061 | 0.0313 | 0.0851 | 0.0950 |
5.3768 | 56.0 | 168 | 2.2536 | 0.1061 | 0.0313 | 0.0851 | 0.0950 |
5.2819 | 57.0 | 171 | 2.2191 | 0.1061 | 0.0313 | 0.0851 | 0.0950 |
5.2262 | 58.0 | 174 | 2.1852 | 0.1428 | 0.0442 | 0.1169 | 0.1310 |
5.2138 | 59.0 | 177 | 2.1516 | 0.1436 | 0.0406 | 0.1190 | 0.1322 |
5.1761 | 60.0 | 180 | 2.1187 | 0.1621 | 0.0517 | 0.1353 | 0.1491 |
5.1171 | 61.0 | 183 | 2.0865 | 0.1798 | 0.0564 | 0.1551 | 0.1679 |
5.0242 | 62.0 | 186 | 2.0561 | 0.1858 | 0.0582 | 0.1624 | 0.1748 |
4.9751 | 63.0 | 189 | 2.0281 | 0.2180 | 0.0703 | 0.1937 | 0.2056 |
4.9801 | 64.0 | 192 | 2.0015 | 0.2425 | 0.0763 | 0.2178 | 0.2297 |
4.8846 | 65.0 | 195 | 1.9753 | 0.2721 | 0.0891 | 0.2489 | 0.2600 |
4.8342 | 66.0 | 198 | 1.9501 | 0.2923 | 0.0940 | 0.2652 | 0.2783 |
4.8251 | 67.0 | 201 | 1.9253 | 0.3031 | 0.0969 | 0.2759 | 0.2883 |
4.7383 | 68.0 | 204 | 1.9017 | 0.3175 | 0.1024 | 0.2882 | 0.3027 |
4.7067 | 69.0 | 207 | 1.8781 | 0.3314 | 0.1042 | 0.2996 | 0.3144 |
4.6585 | 70.0 | 210 | 1.8545 | 0.3303 | 0.1029 | 0.2995 | 0.3138 |
4.5942 | 71.0 | 213 | 1.8311 | 0.3281 | 0.1015 | 0.2993 | 0.3130 |
4.554 | 72.0 | 216 | 1.8087 | 0.3281 | 0.1015 | 0.2993 | 0.3130 |
4.5399 | 73.0 | 219 | 1.7882 | 0.3281 | 0.1015 | 0.2993 | 0.3130 |
4.5289 | 74.0 | 222 | 1.7686 | 0.3300 | 0.1048 | 0.3018 | 0.3164 |
4.4602 | 75.0 | 225 | 1.7503 | 0.3301 | 0.1053 | 0.3021 | 0.3164 |
4.4387 | 76.0 | 228 | 1.7330 | 0.3301 | 0.1053 | 0.3021 | 0.3164 |
4.391 | 77.0 | 231 | 1.7164 | 0.3288 | 0.1045 | 0.3012 | 0.3158 |
4.3463 | 78.0 | 234 | 1.7008 | 0.3401 | 0.1067 | 0.3106 | 0.3253 |
4.2849 | 79.0 | 237 | 1.6855 | 0.3467 | 0.1082 | 0.3139 | 0.3309 |
4.3224 | 80.0 | 240 | 1.6701 | 0.3467 | 0.1082 | 0.3139 | 0.3309 |
4.2964 | 81.0 | 243 | 1.6547 | 0.3467 | 0.1082 | 0.3139 | 0.3309 |
4.2256 | 82.0 | 246 | 1.6403 | 0.3467 | 0.1082 | 0.3139 | 0.3309 |
4.2265 | 83.0 | 249 | 1.6268 | 0.3474 | 0.1092 | 0.3144 | 0.3316 |
4.1872 | 84.0 | 252 | 1.6139 | 0.3468 | 0.1092 | 0.3140 | 0.3310 |
4.1531 | 85.0 | 255 | 1.6014 | 0.3529 | 0.1152 | 0.3158 | 0.3370 |
4.1586 | 86.0 | 258 | 1.5896 | 0.3529 | 0.1152 | 0.3158 | 0.3370 |
4.1404 | 87.0 | 261 | 1.5787 | 0.3529 | 0.1152 | 0.3158 | 0.3370 |
4.0821 | 88.0 | 264 | 1.5687 | 0.3529 | 0.1152 | 0.3158 | 0.3370 |
4.1139 | 89.0 | 267 | 1.5593 | 0.3557 | 0.1174 | 0.3174 | 0.3401 |
4.0456 | 90.0 | 270 | 1.5503 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
4.0538 | 91.0 | 273 | 1.5418 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
4.072 | 92.0 | 276 | 1.5339 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
3.9976 | 93.0 | 279 | 1.5264 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
4.0005 | 94.0 | 282 | 1.5194 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
3.9937 | 95.0 | 285 | 1.5132 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
3.9968 | 96.0 | 288 | 1.5077 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
3.9667 | 97.0 | 291 | 1.5027 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
3.9796 | 98.0 | 294 | 1.4982 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
3.9699 | 99.0 | 297 | 1.4941 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
3.8984 | 100.0 | 300 | 1.4903 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
3.898 | 101.0 | 303 | 1.4870 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
3.9476 | 102.0 | 306 | 1.4842 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
3.9353 | 103.0 | 309 | 1.4818 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
3.9377 | 104.0 | 312 | 1.4800 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
3.9396 | 105.0 | 315 | 1.4788 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
3.9439 | 106.0 | 318 | 1.4780 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
7.8954 | 106.8 | 320 | 1.4778 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
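
The Rouge columns above are standard ROUGE F-measures (rouge1, rouge2, rougeL, rougeLsum). The evaluation code itself is not included in this card, but a minimal sketch of computing the same metrics with the `evaluate` library looks like this; the prediction and reference strings are illustrative only.

```python
# Illustrative ROUGE computation with the `evaluate` library; the strings
# below are made-up examples, not samples from the real evaluation set.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["When was the Eiffel Tower completed?"]
references = ["In what year was the Eiffel Tower completed?"]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```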
### Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0