PoliteT5Small

This model is a fine-tuned version of google/flan-t5-small on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 1.3497
  • Toxicity Ratio: 0.2368
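
The card gives no usage example. Below is a minimal inference sketch, assuming the hub ID AayushW/PoliteT5Small and that the model takes plain text and returns a politer rewrite; the prompt format is an assumption, since the card does not document one.

```python
# Minimal inference sketch. Assumptions: the hub ID below, and that the model
# accepts plain text without a task prefix (the card documents no prompt format).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "AayushW/PoliteT5Small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Send me the report now."
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```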

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.01
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 75
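
A hedged reconstruction of these settings as Transformers training arguments; the output directory, the choice of Seq2SeqTrainingArguments, and the per-epoch evaluation strategy are assumptions (the results table below reports one evaluation per epoch), not taken from a published training script.

```python
# Hedged reconstruction of the listed hyperparameters. Assumptions: output_dir,
# Seq2SeqTrainingArguments (vs. plain TrainingArguments), and eval_strategy.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="PoliteT5Small",    # assumption
    learning_rate=0.01,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=75,
    eval_strategy="epoch",         # assumption: the table shows per-epoch eval
)
```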

Training results

| Training Loss | Epoch | Step | Validation Loss | Toxicity Ratio |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|
| No log | 1.0  | 44   | 0.6088 | 0.2544 |
| 1.2582 | 2.0  | 88   | 0.6911 | 0.2456 |
| 0.7031 | 3.0  | 132  | 0.7221 | 0.2193 |
| 0.5203 | 4.0  | 176  | 0.6663 | 0.2368 |
| 0.4341 | 5.0  | 220  | 0.7135 | 0.2456 |
| 0.3574 | 6.0  | 264  | 0.7515 | 0.2281 |
| 0.3082 | 7.0  | 308  | 0.7641 | 0.2719 |
| 0.2842 | 8.0  | 352  | 0.8101 | 0.2719 |
| 0.2842 | 9.0  | 396  | 0.8355 | 0.2456 |
| 0.2331 | 10.0 | 440  | 0.8099 | 0.2281 |
| 0.2863 | 11.0 | 484  | 0.8992 | 0.2719 |
| 0.1934 | 12.0 | 528  | 0.8319 | 0.2368 |
| 0.1935 | 13.0 | 572  | 0.9640 | 0.2632 |
| 0.1615 | 14.0 | 616  | 0.9779 | 0.2719 |
| 0.159  | 15.0 | 660  | 0.9226 | 0.2719 |
| 0.1423 | 16.0 | 704  | 0.8809 | 0.2632 |
| 0.1423 | 17.0 | 748  | 0.9373 | 0.2193 |
| 0.1458 | 18.0 | 792  | 0.9107 | 0.2807 |
| 0.1316 | 19.0 | 836  | 0.9115 | 0.2281 |
| 0.1253 | 20.0 | 880  | 0.8714 | 0.2281 |
| 0.1096 | 21.0 | 924  | 0.9176 | 0.2632 |
| 0.1136 | 22.0 | 968  | 1.0679 | 0.2719 |
| 0.0948 | 23.0 | 1012 | 0.9609 | 0.2281 |
| 0.0932 | 24.0 | 1056 | 0.9093 | 0.2544 |
| 0.0898 | 25.0 | 1100 | 0.9471 | 0.2368 |
| 0.0898 | 26.0 | 1144 | 0.9743 | 0.2719 |
| 0.0896 | 27.0 | 1188 | 0.9122 | 0.2456 |
| 0.0788 | 28.0 | 1232 | 0.9918 | 0.2544 |
| 0.0813 | 29.0 | 1276 | 0.9951 | 0.2544 |
| 0.0665 | 30.0 | 1320 | 1.0162 | 0.2456 |
| 0.0629 | 31.0 | 1364 | 1.0298 | 0.2368 |
| 0.0587 | 32.0 | 1408 | 1.0423 | 0.2368 |
| 0.0508 | 33.0 | 1452 | 1.0082 | 0.2895 |
| 0.0508 | 34.0 | 1496 | 0.9940 | 0.2368 |
| 0.0456 | 35.0 | 1540 | 1.0864 | 0.2895 |
| 0.0414 | 36.0 | 1584 | 1.0233 | 0.2807 |
| 0.0472 | 37.0 | 1628 | 1.0083 | 0.2719 |
| 0.0441 | 38.0 | 1672 | 1.0161 | 0.2632 |
| 0.037  | 39.0 | 1716 | 1.0209 | 0.2719 |
| 0.0303 | 40.0 | 1760 | 1.1103 | 0.2895 |
| 0.0288 | 41.0 | 1804 | 1.2275 | 0.2632 |
| 0.0288 | 42.0 | 1848 | 1.1029 | 0.2895 |
| 0.0255 | 43.0 | 1892 | 1.0100 | 0.2719 |
| 0.0196 | 44.0 | 1936 | 1.1369 | 0.2544 |
| 0.0161 | 45.0 | 1980 | 1.2074 | 0.2632 |
| 0.019  | 46.0 | 2024 | 1.1192 | 0.2281 |
| 0.0165 | 47.0 | 2068 | 1.1283 | 0.3246 |
| 0.019  | 48.0 | 2112 | 1.1478 | 0.2368 |
| 0.0177 | 49.0 | 2156 | 1.1950 | 0.2544 |
| 0.0153 | 50.0 | 2200 | 1.0713 | 0.2719 |
| 0.0153 | 51.0 | 2244 | 1.2038 | 0.2544 |
| 0.0135 | 52.0 | 2288 | 1.1361 | 0.2368 |
| 0.0096 | 53.0 | 2332 | 1.1446 | 0.2632 |
| 0.0106 | 54.0 | 2376 | 1.1622 | 0.2632 |
| 0.0107 | 55.0 | 2420 | 1.2000 | 0.2807 |
| 0.0061 | 56.0 | 2464 | 1.3061 | 0.2719 |
| 0.0091 | 57.0 | 2508 | 1.1605 | 0.2368 |
| 0.0051 | 58.0 | 2552 | 1.1857 | 0.2456 |
| 0.0051 | 59.0 | 2596 | 1.2568 | 0.2368 |
| 0.0045 | 60.0 | 2640 | 1.2608 | 0.2544 |
| 0.0039 | 61.0 | 2684 | 1.2377 | 0.2544 |
| 0.0049 | 62.0 | 2728 | 1.1845 | 0.2368 |
| 0.0029 | 63.0 | 2772 | 1.3083 | 0.2281 |
| 0.0026 | 64.0 | 2816 | 1.2598 | 0.2544 |
| 0.0021 | 65.0 | 2860 | 1.3106 | 0.2632 |
| 0.0022 | 66.0 | 2904 | 1.2980 | 0.2368 |
| 0.0022 | 67.0 | 2948 | 1.3119 | 0.2544 |
| 0.0024 | 68.0 | 2992 | 1.2976 | 0.2193 |
| 0.0021 | 69.0 | 3036 | 1.2685 | 0.2456 |
| 0.002  | 70.0 | 3080 | 1.3019 | 0.2544 |
| 0.0012 | 71.0 | 3124 | 1.3376 | 0.2456 |
| 0.0012 | 72.0 | 3168 | 1.3415 | 0.2456 |
| 0.0011 | 73.0 | 3212 | 1.3483 | 0.2281 |
| 0.0012 | 74.0 | 3256 | 1.3545 | 0.2368 |
| 0.0025 | 75.0 | 3300 | 1.3497 | 0.2368 |
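
Validation loss is lowest after the first epoch (0.6088) and climbs steadily while training loss approaches zero, a pattern consistent with overfitting; the headline results above correspond to the epoch-75 checkpoint.

The card also does not say how the Toxicity Ratio is computed. One plausible reading is the fraction of generated outputs that a toxicity classifier scores as toxic; the sketch below uses the evaluate library's toxicity measurement, which is an assumption rather than the card's actual metric, as is the 0.5 threshold.

```python
# Hypothetical reconstruction of a "toxicity ratio": the fraction of generated
# texts scored above a threshold by a toxicity classifier. The evaluate toxicity
# measurement and the 0.5 threshold are assumptions, not the card's stated method.
import evaluate

toxicity = evaluate.load("toxicity", module_type="measurement")

generated_texts = ["Could you please send the report?"]  # model outputs go here
result = toxicity.compute(
    predictions=generated_texts,
    aggregation="ratio",   # fraction of predictions above toxic_threshold
    toxic_threshold=0.5,
)
print(result["toxicity_ratio"])
```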

Framework versions

  • Transformers 4.47.0
  • PyTorch 2.5.1+cu121
  • Datasets 3.3.1
  • Tokenizers 0.21.0