flan-t5-rouge-squad-qg-testb

This model is a fine-tuned version of google/flan-t5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4778
  • Rouge1: 0.3588
  • Rouge2: 0.1173
  • Rougel: 0.3192
  • Rougelsum: 0.3430
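
For a quick check of what the model produces, the checkpoint can be loaded with the standard Hugging Face seq2seq classes. The sketch below assumes the repository id devagonal/flan-t5-rouge-squad-qg-testb and a SQuAD-style context prompt for question generation; the exact prompt template used during fine-tuning is not documented here, so the "generate question:" prefix is an assumption.

```python
# Minimal inference sketch. The repository id and the prompt prefix are
# assumptions; the card does not document the preprocessing used in training.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "devagonal/flan-t5-rouge-squad-qg-testb"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

context = (
    "The Normans were the people who in the 10th and 11th centuries "
    "gave their name to Normandy, a region in France."
)
prompt = f"generate question: {context}"  # hypothetical prompt template

inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```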

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 80
  • eval_batch_size: 80
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 320
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 160
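
A minimal sketch of how these hyperparameters might be expressed with Seq2SeqTrainingArguments is shown below; the output directory, evaluation strategy, and generation setting are assumptions, since the original training script is not part of this card.

```python
# Sketch of Seq2SeqTrainingArguments mirroring the hyperparameters above.
# The output_dir and eval_strategy values are illustrative assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-squad-qg-testb",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=80,
    per_device_eval_batch_size=80,
    gradient_accumulation_steps=4,   # effective train batch size: 80 * 4 = 320
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=160,
    predict_with_generate=True,      # needed to compute ROUGE on generated text
    eval_strategy="epoch",           # assumption; the card reports per-epoch eval
)
```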

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 66.9315 | 1.0 | 3 | 33.8039 | 0.0551 | 0.0150 | 0.0500 | 0.0511 |
| 59.6739 | 2.0 | 6 | 29.7314 | 0.0864 | 0.0288 | 0.0776 | 0.0784 |
| 54.5656 | 3.0 | 9 | 26.4939 | 0.0875 | 0.0310 | 0.0801 | 0.0805 |
| 48.9503 | 4.0 | 12 | 23.5018 | 0.0705 | 0.0230 | 0.0658 | 0.0661 |
| 44.6287 | 5.0 | 15 | 20.5282 | 0.0584 | 0.0179 | 0.0536 | 0.0541 |
| 39.5637 | 6.0 | 18 | 17.2638 | 0.0729 | 0.0340 | 0.0726 | 0.0724 |
| 35.0541 | 7.0 | 21 | 13.6554 | 0.0856 | 0.0389 | 0.0853 | 0.0853 |
| 29.799 | 8.0 | 24 | 10.2408 | 0.0821 | 0.0411 | 0.0816 | 0.0815 |
| 24.9435 | 9.0 | 27 | 8.2909 | 0.0705 | 0.0387 | 0.0707 | 0.0703 |
| 21.8868 | 10.0 | 30 | 7.8624 | 0.0764 | 0.0412 | 0.0765 | 0.0765 |
| 19.6007 | 11.0 | 33 | 7.6396 | 0.0789 | 0.0421 | 0.0791 | 0.0788 |
| 17.6708 | 12.0 | 36 | 7.3917 | 0.0786 | 0.0402 | 0.0757 | 0.0768 |
| 16.0957 | 13.0 | 39 | 7.0298 | 0.0874 | 0.0411 | 0.0845 | 0.0858 |
| 14.8993 | 14.0 | 42 | 6.3918 | 0.0933 | 0.0416 | 0.0885 | 0.0905 |
| 13.8388 | 15.0 | 45 | 5.6045 | 0.0940 | 0.0442 | 0.0889 | 0.0910 |
| 12.7787 | 16.0 | 48 | 5.0419 | 0.0962 | 0.0445 | 0.0899 | 0.0927 |
| 11.9806 | 17.0 | 51 | 4.7749 | 0.0887 | 0.0399 | 0.0822 | 0.0850 |
| 11.2918 | 18.0 | 54 | 4.6137 | 0.0965 | 0.0459 | 0.0910 | 0.0943 |
| 10.7497 | 19.0 | 57 | 4.4894 | 0.0952 | 0.0427 | 0.0875 | 0.0913 |
| 10.3252 | 20.0 | 60 | 4.3848 | 0.1051 | 0.0553 | 0.0968 | 0.1011 |
| 9.8977 | 21.0 | 63 | 4.2909 | 0.1068 | 0.0520 | 0.0945 | 0.1002 |
| 9.6954 | 22.0 | 66 | 4.2021 | 0.0953 | 0.0412 | 0.0822 | 0.0884 |
| 9.4192 | 23.0 | 69 | 4.1174 | 0.1043 | 0.0404 | 0.0887 | 0.0947 |
| 9.147 | 24.0 | 72 | 4.0337 | 0.1089 | 0.0466 | 0.0961 | 0.1016 |
| 8.8825 | 25.0 | 75 | 3.9481 | 0.1026 | 0.0436 | 0.0911 | 0.0958 |
| 8.6989 | 26.0 | 78 | 3.8586 | 0.1082 | 0.0477 | 0.0955 | 0.1004 |
| 8.4876 | 27.0 | 81 | 3.7634 | 0.1088 | 0.0479 | 0.0960 | 0.1008 |
| 8.3068 | 28.0 | 84 | 3.6638 | 0.1119 | 0.0432 | 0.0943 | 0.1017 |
| 8.1205 | 29.0 | 87 | 3.5645 | 0.1067 | 0.0409 | 0.0900 | 0.0973 |
| 7.985 | 30.0 | 90 | 3.4753 | 0.1205 | 0.0425 | 0.0991 | 0.1075 |
| 7.8097 | 31.0 | 93 | 3.4016 | 0.1024 | 0.0422 | 0.0855 | 0.0925 |
| 7.6231 | 32.0 | 96 | 3.3417 | 0.1108 | 0.0415 | 0.0910 | 0.0985 |
| 7.5175 | 33.0 | 99 | 3.2915 | 0.0932 | 0.0333 | 0.0757 | 0.0833 |
| 7.3729 | 34.0 | 102 | 3.2466 | 0.0965 | 0.0335 | 0.0779 | 0.0864 |
| 7.1983 | 35.0 | 105 | 3.2033 | 0.0967 | 0.0333 | 0.0781 | 0.0859 |
| 7.0976 | 36.0 | 108 | 3.1594 | 0.0861 | 0.0293 | 0.0711 | 0.0760 |
| 6.9906 | 37.0 | 111 | 3.1135 | 0.0861 | 0.0293 | 0.0711 | 0.0760 |
| 6.8668 | 38.0 | 114 | 3.0645 | 0.0910 | 0.0304 | 0.0762 | 0.0812 |
| 6.7592 | 39.0 | 117 | 3.0116 | 0.0985 | 0.0314 | 0.0823 | 0.0873 |
| 6.6407 | 40.0 | 120 | 2.9563 | 0.1002 | 0.0304 | 0.0837 | 0.0887 |
| 6.5709 | 41.0 | 123 | 2.9008 | 0.0981 | 0.0295 | 0.0831 | 0.0871 |
| 6.4844 | 42.0 | 126 | 2.8449 | 0.1117 | 0.0370 | 0.0939 | 0.0979 |
| 6.3734 | 43.0 | 129 | 2.7906 | 0.1034 | 0.0351 | 0.0864 | 0.0907 |
| 6.2984 | 44.0 | 132 | 2.7383 | 0.0929 | 0.0322 | 0.0795 | 0.0826 |
| 6.203 | 45.0 | 135 | 2.6884 | 0.1004 | 0.0350 | 0.0840 | 0.0892 |
| 6.1458 | 46.0 | 138 | 2.6409 | 0.1054 | 0.0354 | 0.0885 | 0.0935 |
| 6.0371 | 47.0 | 141 | 2.5963 | 0.0916 | 0.0315 | 0.0783 | 0.0831 |
| 5.9271 | 48.0 | 144 | 2.5539 | 0.0775 | 0.0312 | 0.0657 | 0.0709 |
| 5.9151 | 49.0 | 147 | 2.5119 | 0.0842 | 0.0323 | 0.0717 | 0.0772 |
| 5.7851 | 50.0 | 150 | 2.4719 | 0.0893 | 0.0312 | 0.0726 | 0.0786 |
| 5.6981 | 51.0 | 153 | 2.4329 | 0.0789 | 0.0245 | 0.0634 | 0.0694 |
| 5.6468 | 52.0 | 156 | 2.3949 | 0.0868 | 0.0273 | 0.0677 | 0.0751 |
| 5.5874 | 53.0 | 159 | 2.3584 | 0.0868 | 0.0273 | 0.0677 | 0.0751 |
| 5.5238 | 54.0 | 162 | 2.3235 | 0.1064 | 0.0309 | 0.0853 | 0.0941 |
| 5.4892 | 55.0 | 165 | 2.2884 | 0.1061 | 0.0313 | 0.0851 | 0.0950 |
| 5.3768 | 56.0 | 168 | 2.2536 | 0.1061 | 0.0313 | 0.0851 | 0.0950 |
| 5.2819 | 57.0 | 171 | 2.2191 | 0.1061 | 0.0313 | 0.0851 | 0.0950 |
| 5.2262 | 58.0 | 174 | 2.1852 | 0.1428 | 0.0442 | 0.1169 | 0.1310 |
| 5.2138 | 59.0 | 177 | 2.1516 | 0.1436 | 0.0406 | 0.1190 | 0.1322 |
| 5.1761 | 60.0 | 180 | 2.1187 | 0.1621 | 0.0517 | 0.1353 | 0.1491 |
| 5.1171 | 61.0 | 183 | 2.0865 | 0.1798 | 0.0564 | 0.1551 | 0.1679 |
| 5.0242 | 62.0 | 186 | 2.0561 | 0.1858 | 0.0582 | 0.1624 | 0.1748 |
| 4.9751 | 63.0 | 189 | 2.0281 | 0.2180 | 0.0703 | 0.1937 | 0.2056 |
| 4.9801 | 64.0 | 192 | 2.0015 | 0.2425 | 0.0763 | 0.2178 | 0.2297 |
| 4.8846 | 65.0 | 195 | 1.9753 | 0.2721 | 0.0891 | 0.2489 | 0.2600 |
| 4.8342 | 66.0 | 198 | 1.9501 | 0.2923 | 0.0940 | 0.2652 | 0.2783 |
| 4.8251 | 67.0 | 201 | 1.9253 | 0.3031 | 0.0969 | 0.2759 | 0.2883 |
| 4.7383 | 68.0 | 204 | 1.9017 | 0.3175 | 0.1024 | 0.2882 | 0.3027 |
| 4.7067 | 69.0 | 207 | 1.8781 | 0.3314 | 0.1042 | 0.2996 | 0.3144 |
| 4.6585 | 70.0 | 210 | 1.8545 | 0.3303 | 0.1029 | 0.2995 | 0.3138 |
| 4.5942 | 71.0 | 213 | 1.8311 | 0.3281 | 0.1015 | 0.2993 | 0.3130 |
| 4.554 | 72.0 | 216 | 1.8087 | 0.3281 | 0.1015 | 0.2993 | 0.3130 |
| 4.5399 | 73.0 | 219 | 1.7882 | 0.3281 | 0.1015 | 0.2993 | 0.3130 |
| 4.5289 | 74.0 | 222 | 1.7686 | 0.3300 | 0.1048 | 0.3018 | 0.3164 |
| 4.4602 | 75.0 | 225 | 1.7503 | 0.3301 | 0.1053 | 0.3021 | 0.3164 |
| 4.4387 | 76.0 | 228 | 1.7330 | 0.3301 | 0.1053 | 0.3021 | 0.3164 |
| 4.391 | 77.0 | 231 | 1.7164 | 0.3288 | 0.1045 | 0.3012 | 0.3158 |
| 4.3463 | 78.0 | 234 | 1.7008 | 0.3401 | 0.1067 | 0.3106 | 0.3253 |
| 4.2849 | 79.0 | 237 | 1.6855 | 0.3467 | 0.1082 | 0.3139 | 0.3309 |
| 4.3224 | 80.0 | 240 | 1.6701 | 0.3467 | 0.1082 | 0.3139 | 0.3309 |
| 4.2964 | 81.0 | 243 | 1.6547 | 0.3467 | 0.1082 | 0.3139 | 0.3309 |
| 4.2256 | 82.0 | 246 | 1.6403 | 0.3467 | 0.1082 | 0.3139 | 0.3309 |
| 4.2265 | 83.0 | 249 | 1.6268 | 0.3474 | 0.1092 | 0.3144 | 0.3316 |
| 4.1872 | 84.0 | 252 | 1.6139 | 0.3468 | 0.1092 | 0.3140 | 0.3310 |
| 4.1531 | 85.0 | 255 | 1.6014 | 0.3529 | 0.1152 | 0.3158 | 0.3370 |
| 4.1586 | 86.0 | 258 | 1.5896 | 0.3529 | 0.1152 | 0.3158 | 0.3370 |
| 4.1404 | 87.0 | 261 | 1.5787 | 0.3529 | 0.1152 | 0.3158 | 0.3370 |
| 4.0821 | 88.0 | 264 | 1.5687 | 0.3529 | 0.1152 | 0.3158 | 0.3370 |
| 4.1139 | 89.0 | 267 | 1.5593 | 0.3557 | 0.1174 | 0.3174 | 0.3401 |
| 4.0456 | 90.0 | 270 | 1.5503 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
| 4.0538 | 91.0 | 273 | 1.5418 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
| 4.072 | 92.0 | 276 | 1.5339 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
| 3.9976 | 93.0 | 279 | 1.5264 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
| 4.0005 | 94.0 | 282 | 1.5194 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
| 3.9937 | 95.0 | 285 | 1.5132 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
| 3.9968 | 96.0 | 288 | 1.5077 | 0.3587 | 0.1180 | 0.3201 | 0.3426 |
| 3.9667 | 97.0 | 291 | 1.5027 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
| 3.9796 | 98.0 | 294 | 1.4982 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
| 3.9699 | 99.0 | 297 | 1.4941 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
| 3.8984 | 100.0 | 300 | 1.4903 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
| 3.898 | 101.0 | 303 | 1.4870 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
| 3.9476 | 102.0 | 306 | 1.4842 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
| 3.9353 | 103.0 | 309 | 1.4818 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
| 3.9377 | 104.0 | 312 | 1.4800 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
| 3.9396 | 105.0 | 315 | 1.4788 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
| 3.9439 | 106.0 | 318 | 1.4780 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
| 7.8954 | 106.8 | 320 | 1.4778 | 0.3588 | 0.1173 | 0.3192 | 0.3430 |
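
The Rouge1, Rouge2, Rougel, and Rougelsum columns are ROUGE F-measures in the 0-1 range. A minimal sketch of computing them with the evaluate library is shown below; the predictions and references are illustrative, and the exact metric configuration used for this card is not documented.

```python
# Illustrative ROUGE computation with the `evaluate` library; the exact
# configuration used for this card is not documented, so treat this as a sketch.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["What did the Normans give their name to?"]        # example model outputs
references = ["What region did the Normans give their name to?"]  # example gold questions

scores = rouge.compute(predictions=predictions, references=references)
# `scores` is a dict with keys rouge1, rouge2, rougeL, rougeLsum (F-measures in [0, 1])
print(scores)
```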

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0
