flan-t5-finetune-v3

This model is a fine-tuned version of google/flan-t5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6234
  • Rouge1: 0.3289
  • Rouge2: 0.1822
  • RougeL: 0.3016
  • RougeLsum: 0.3018
  • Gen Len: 19.8627
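
For a quick smoke test, a minimal inference sketch follows. It assumes the checkpoint is published as de-slothbug/flan-t5-finetune-v3 and, judging from the ROUGE metrics and the ~20-token average generation length, that the task is summarization; the prompt prefix below is illustrative rather than documented.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "de-slothbug/flan-t5-finetune-v3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The repository stores the weights in bfloat16; load them as-is.
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# The "summarize:" prefix is a guess based on the ROUGE metrics above;
# replace it with whatever prompt format the model was trained on.
inputs = tokenizer("summarize: <your input text>", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```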

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
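
As a point of reference, these settings map onto Seq2SeqTrainingArguments roughly as sketched below; this is not the exact training script, and output_dir and predict_with_generate are assumptions. Note that the effective train batch size is train_batch_size × gradient_accumulation_steps = 8 × 4 = 32, matching total_train_batch_size above.

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-finetune-v3",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # 8 * 4 = 32 effective train batch size
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=3,
    predict_with_generate=True,  # required to compute ROUGE / Gen Len during eval
)
```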

Training results

| Training Loss | Epoch  | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|---------------|--------|------|-----------------|--------|--------|--------|-----------|---------|
| 1.571         | 0.1739 | 500  | 1.0387          | 0.3684 | 0.2174 | 0.337  | 0.3372    | 19.727  |
| 1.0043        | 0.3478 | 1000 | 0.7315          | 0.3288 | 0.1686 | 0.3007 | 0.3009    | 19.5687 |
| 0.9122        | 0.5217 | 1500 | 0.6687          | 0.3276 | 0.1692 | 0.2977 | 0.2979    | 19.7673 |
| 0.8667        | 0.6956 | 2000 | 0.6440          | 0.3331 | 0.1795 | 0.3041 | 0.3044    | 19.8282 |
| 0.8554        | 0.8695 | 2500 | 0.6328          | 0.3276 | 0.1768 | 0.2997 | 0.3001    | 19.8418 |
| 0.8431        | 1.0431 | 3000 | 0.6282          | 0.3334 | 0.1832 | 0.3054 | 0.3057    | 19.8683 |
| 0.8288        | 1.2170 | 3500 | 0.6260          | 0.3304 | 0.1821 | 0.3027 | 0.303     | 19.855  |
| 0.841         | 1.3909 | 4000 | 0.6243          | 0.3301 | 0.1825 | 0.3026 | 0.3028    | 19.8639 |
| 0.8423        | 1.5648 | 4500 | 0.6239          | 0.329  | 0.1821 | 0.3021 | 0.3023    | 19.8687 |
| 0.835         | 1.7387 | 5000 | 0.6235          | 0.328  | 0.181  | 0.3007 | 0.3011    | 19.8664 |
| 0.8265        | 1.9126 | 5500 | 0.6236          | 0.3286 | 0.1822 | 0.3019 | 0.3021    | 19.8621 |
| 0.8452        | 2.0863 | 6000 | 0.6235          | 0.3295 | 0.1829 | 0.3023 | 0.3026    | 19.865  |
| 0.8369        | 2.2602 | 6500 | 0.6235          | 0.3287 | 0.182  | 0.3016 | 0.3019    | 19.8674 |
| 0.8481        | 2.4340 | 7000 | 0.6235          | 0.3295 | 0.1826 | 0.3024 | 0.3027    | 19.8648 |
| 0.835         | 2.6079 | 7500 | 0.6234          | 0.3287 | 0.1818 | 0.3014 | 0.3017    | 19.865  |
| 0.8278        | 2.7818 | 8000 | 0.6234          | 0.3285 | 0.1816 | 0.3015 | 0.3017    | 19.8664 |
| 0.8257        | 2.9557 | 8500 | 0.6234          | 0.3289 | 0.1822 | 0.3016 | 0.3018    | 19.8627 |
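
The ROUGE and Gen Len columns are the kind of numbers a Seq2SeqTrainer compute_metrics hook produces. A sketch of such a hook using the evaluate library is below; the in-scope tokenizer and the -100 label-padding convention are assumptions based on standard Trainer behavior, not details confirmed by this card.

```python
import numpy as np
import evaluate

rouge = evaluate.load("rouge")

def compute_metrics(eval_pred):
    # Hypothetical metrics hook; assumes a `tokenizer` in scope and that
    # label padding uses -100, the Trainer default for seq2seq models.
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    # "Gen Len": mean length of generated sequences, excluding pad tokens.
    result["gen_len"] = float(
        np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in predictions])
    )
    return result
```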

Framework versions

  • Transformers 4.55.4
  • Pytorch 2.6.0+cu124
  • Datasets 2.18.0
  • Tokenizers 0.21.2
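
To reproduce this environment, install matching versions, e.g. `pip install transformers==4.55.4 torch==2.6.0 datasets==2.18.0 tokenizers==0.21.2`. The card lists the cu124 build of PyTorch; the exact wheel to use depends on your CUDA setup.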