---
license: apache-2.0
base_model: google-t5/t5-small
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: flan-t5-small-summarization
  results: []
---

# flan-t5-small-summarization

This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.9716
- Rouge1: 14.8237
- Rouge2: 5.3275
- Rougel: 12.6729
- Rougelsum: 13.6266
- Gen Len: 18.968
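
For reference, a minimal usage sketch with the Transformers library. The repo id is an assumption inferred from this card (author "rhaymison", model name "flan-t5-small-summarization"); replace it if your copy lives elsewhere.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "rhaymison/flan-t5-small-summarization"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = (
    "T5 frames every task as text-to-text: the encoder reads the source "
    "document and the decoder generates a shorter sequence as the summary."
)

# T5 checkpoints conventionally expect a task prefix for summarization.
inputs = tokenizer("summarize: " + text, return_tensors="pt", truncation=True)
# Gen Len above is roughly 19 tokens, so cap generation accordingly.
summary_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```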

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (mirrored in the sketch after this list):

- learning_rate: 2e-05
- train_batch_size: 6
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 24
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
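
These map directly onto `Seq2SeqTrainingArguments`; a sketch under the assumption that the standard `Seq2SeqTrainer` setup was used. The output directory is illustrative, and the Adam betas/epsilon listed above are the optimizer defaults.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-small-summarization",  # illustrative
    learning_rate=2e-5,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    gradient_accumulation_steps=4,   # effective train batch size: 24
    num_train_epochs=5,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                       # "Native AMP" mixed precision
    predict_with_generate=True,      # needed to compute ROUGE at eval time
    evaluation_strategy="steps",
    eval_steps=100,                  # matches the 100-step cadence below
)
```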

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| No log        | 0.12  | 100  | 2.0773          | 15.1231 | 5.4025 | 12.9496 | 13.9319   | 18.94   |
| No log        | 0.24  | 200  | 2.0736          | 14.7565 | 5.2799 | 12.6268 | 13.5578   | 18.94   |
| No log        | 0.36  | 300  | 2.0632          | 14.8383 | 5.2319 | 12.6555 | 13.6597   | 18.968  |
| No log        | 0.48  | 400  | 2.0629          | 14.8558 | 5.2815 | 12.6581 | 13.6503   | 18.968  |
| 2.2157        | 0.6   | 500  | 2.0583          | 14.8736 | 5.3228 | 12.649  | 13.6717   | 18.968  |
| 2.2157        | 0.72  | 600  | 2.0520          | 14.8178 | 5.3112 | 12.586  | 13.6262   | 18.968  |
| 2.2157        | 0.84  | 700  | 2.0467          | 14.9042 | 5.3468 | 12.6543 | 13.6596   | 18.968  |
| 2.2157        | 0.96  | 800  | 2.0435          | 14.8682 | 5.3287 | 12.661  | 13.6869   | 18.968  |
| 2.2157        | 1.08  | 900  | 2.0375          | 14.9469 | 5.362  | 12.7083 | 13.7525   | 18.968  |
| 2.1846        | 1.2   | 1000 | 2.0324          | 14.8316 | 5.3471 | 12.6593 | 13.6452   | 18.968  |
| 2.1846        | 1.32  | 1100 | 2.0309          | 14.6717 | 5.2555 | 12.5319 | 13.4962   | 18.968  |
| 2.1846        | 1.44  | 1200 | 2.0189          | 14.8455 | 5.3386 | 12.6002 | 13.6588   | 18.968  |
| 2.1846        | 1.56  | 1300 | 2.0182          | 14.9323 | 5.3902 | 12.7187 | 13.7579   | 18.968  |
| 2.1846        | 1.68  | 1400 | 2.0172          | 14.969  | 5.4698 | 12.8021 | 13.8116   | 18.968  |
| 2.1596        | 1.8   | 1500 | 2.0105          | 15.0152 | 5.5355 | 12.8098 | 13.8475   | 18.968  |
| 2.1596        | 1.92  | 1600 | 2.0100          | 15.0009 | 5.3835 | 12.764  | 13.785    | 18.968  |
| 2.1596        | 2.04  | 1700 | 2.0083          | 14.8145 | 5.2912 | 12.6179 | 13.6279   | 18.968  |
| 2.1596        | 2.16  | 1800 | 2.0035          | 14.8232 | 5.2131 | 12.6386 | 13.6297   | 18.968  |
| 2.1596        | 2.28  | 1900 | 2.0006          | 14.8076 | 5.2617 | 12.6578 | 13.6631   | 18.968  |
| 2.1405        | 2.4   | 2000 | 1.9983          | 14.6508 | 5.0855 | 12.4956 | 13.4989   | 18.968  |
| 2.1405        | 2.52  | 2100 | 1.9965          | 14.9548 | 5.2857 | 12.6947 | 13.7664   | 18.968  |
| 2.1405        | 2.64  | 2200 | 1.9917          | 14.8786 | 5.2212 | 12.6813 | 13.6609   | 18.968  |
| 2.1405        | 2.76  | 2300 | 1.9904          | 15.0902 | 5.4835 | 12.8911 | 13.9191   | 18.968  |
| 2.1405        | 2.88  | 2400 | 1.9880          | 14.8188 | 5.2057 | 12.6325 | 13.6335   | 18.968  |
| 2.1287        | 3.0   | 2500 | 1.9844          | 14.7362 | 5.2487 | 12.6559 | 13.64     | 18.968  |
| 2.1287        | 3.12  | 2600 | 1.9834          | 14.9356 | 5.3404 | 12.7325 | 13.7185   | 18.968  |
| 2.1287        | 3.24  | 2700 | 1.9839          | 14.9543 | 5.4587 | 12.757  | 13.767    | 18.968  |
| 2.1287        | 3.36  | 2800 | 1.9821          | 14.8174 | 5.2522 | 12.6935 | 13.6292   | 18.968  |
| 2.1287        | 3.48  | 2900 | 1.9816          | 14.8201 | 5.2606 | 12.6679 | 13.6275   | 18.968  |
| 2.1149        | 3.6   | 3000 | 1.9795          | 14.8112 | 5.253  | 12.5789 | 13.5714   | 18.968  |
| 2.1149        | 3.72  | 3100 | 1.9788          | 14.7946 | 5.3272 | 12.6237 | 13.614    | 18.968  |
| 2.1149        | 3.84  | 3200 | 1.9761          | 14.8197 | 5.295  | 12.6209 | 13.6327   | 18.968  |
| 2.1149        | 3.96  | 3300 | 1.9761          | 14.7752 | 5.2759 | 12.6239 | 13.6167   | 18.968  |
| 2.1149        | 4.08  | 3400 | 1.9714          | 14.7938 | 5.2988 | 12.7085 | 13.6708   | 18.968  |
| 2.1138        | 4.2   | 3500 | 1.9729          | 14.8006 | 5.2526 | 12.6427 | 13.6018   | 18.968  |
| 2.1138        | 4.32  | 3600 | 1.9751          | 14.7531 | 5.2913 | 12.6372 | 13.5782   | 18.968  |
| 2.1138        | 4.44  | 3700 | 1.9743          | 14.7556 | 5.2694 | 12.6372 | 13.5786   | 18.968  |
| 2.1138        | 4.56  | 3800 | 1.9710          | 14.8124 | 5.2887 | 12.7095 | 13.6666   | 18.968  |
| 2.1138        | 4.68  | 3900 | 1.9725          | 14.7104 | 5.2357 | 12.5839 | 13.5364   | 18.968  |
| 2.1033        | 4.8   | 4000 | 1.9726          | 14.7673 | 5.2771 | 12.6343 | 13.5731   | 18.968  |
| 2.1033        | 4.92  | 4100 | 1.9716          | 14.8237 | 5.3275 | 12.6729 | 13.6266   | 18.968  |
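
The ROUGE columns above can be reproduced with the `evaluate` library; a minimal sketch, where the predictions and references are placeholders standing in for decoded model outputs and gold summaries.

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholders: in training these come from predict_with_generate
# followed by tokenizer.batch_decode.
predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

scores = rouge.compute(predictions=predictions, references=references)
# Keys: rouge1, rouge2, rougeL, rougeLsum; the card reports them x100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```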

### Framework versions

- Transformers 4.38.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2