# PoliteT5Small

This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.3497
- Toxicity Ratio: 0.2368
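A minimal usage sketch follows. The exact task framing and prompt format used during fine-tuning are not documented, so the example input below is an assumption:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AayushW/PoliteT5Small")
model = AutoModelForSeq2SeqLM.from_pretrained("AayushW/PoliteT5Small")

# Example input only; how inputs were framed during fine-tuning is not documented.
text = "Give me the report now."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```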
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.01
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 75
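As a hedged sketch of how these settings might map onto the `transformers` Trainer API (the dataset and preprocessing below are placeholders, since the actual training data is not documented):

```python
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

# Toy placeholder pair; the real training/evaluation data is not documented.
raw = Dataset.from_dict({
    "input": ["Give me the report now."],
    "target": ["Could you please send me the report when you have a moment?"],
})

def preprocess(batch):
    model_inputs = tokenizer(batch["input"], truncation=True)
    model_inputs["labels"] = tokenizer(text_target=batch["target"], truncation=True)["input_ids"]
    return model_inputs

ds = raw.map(preprocess, batched=True, remove_columns=["input", "target"])

# Mirrors the hyperparameters listed above; betas and epsilon are the AdamW defaults.
args = Seq2SeqTrainingArguments(
    output_dir="PoliteT5Small",
    learning_rate=0.01,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    num_train_epochs=75,
    eval_strategy="epoch",
    logging_strategy="epoch",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=ds,
    eval_dataset=ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```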
### Training results

| Training Loss | Epoch | Step | Validation Loss | Toxicity Ratio |
|---|---|---|---|---|
| No log | 1.0 | 44 | 0.6088 | 0.2544 |
| 1.2582 | 2.0 | 88 | 0.6911 | 0.2456 |
| 0.7031 | 3.0 | 132 | 0.7221 | 0.2193 |
| 0.5203 | 4.0 | 176 | 0.6663 | 0.2368 |
| 0.4341 | 5.0 | 220 | 0.7135 | 0.2456 |
| 0.3574 | 6.0 | 264 | 0.7515 | 0.2281 |
| 0.3082 | 7.0 | 308 | 0.7641 | 0.2719 |
| 0.2842 | 8.0 | 352 | 0.8101 | 0.2719 |
| 0.2842 | 9.0 | 396 | 0.8355 | 0.2456 |
| 0.2331 | 10.0 | 440 | 0.8099 | 0.2281 |
| 0.2863 | 11.0 | 484 | 0.8992 | 0.2719 |
| 0.1934 | 12.0 | 528 | 0.8319 | 0.2368 |
| 0.1935 | 13.0 | 572 | 0.9640 | 0.2632 |
| 0.1615 | 14.0 | 616 | 0.9779 | 0.2719 |
| 0.159 | 15.0 | 660 | 0.9226 | 0.2719 |
| 0.1423 | 16.0 | 704 | 0.8809 | 0.2632 |
| 0.1423 | 17.0 | 748 | 0.9373 | 0.2193 |
| 0.1458 | 18.0 | 792 | 0.9107 | 0.2807 |
| 0.1316 | 19.0 | 836 | 0.9115 | 0.2281 |
| 0.1253 | 20.0 | 880 | 0.8714 | 0.2281 |
| 0.1096 | 21.0 | 924 | 0.9176 | 0.2632 |
| 0.1136 | 22.0 | 968 | 1.0679 | 0.2719 |
| 0.0948 | 23.0 | 1012 | 0.9609 | 0.2281 |
| 0.0932 | 24.0 | 1056 | 0.9093 | 0.2544 |
| 0.0898 | 25.0 | 1100 | 0.9471 | 0.2368 |
| 0.0898 | 26.0 | 1144 | 0.9743 | 0.2719 |
| 0.0896 | 27.0 | 1188 | 0.9122 | 0.2456 |
| 0.0788 | 28.0 | 1232 | 0.9918 | 0.2544 |
| 0.0813 | 29.0 | 1276 | 0.9951 | 0.2544 |
| 0.0665 | 30.0 | 1320 | 1.0162 | 0.2456 |
| 0.0629 | 31.0 | 1364 | 1.0298 | 0.2368 |
| 0.0587 | 32.0 | 1408 | 1.0423 | 0.2368 |
| 0.0508 | 33.0 | 1452 | 1.0082 | 0.2895 |
| 0.0508 | 34.0 | 1496 | 0.9940 | 0.2368 |
| 0.0456 | 35.0 | 1540 | 1.0864 | 0.2895 |
| 0.0414 | 36.0 | 1584 | 1.0233 | 0.2807 |
| 0.0472 | 37.0 | 1628 | 1.0083 | 0.2719 |
| 0.0441 | 38.0 | 1672 | 1.0161 | 0.2632 |
| 0.037 | 39.0 | 1716 | 1.0209 | 0.2719 |
| 0.0303 | 40.0 | 1760 | 1.1103 | 0.2895 |
| 0.0288 | 41.0 | 1804 | 1.2275 | 0.2632 |
| 0.0288 | 42.0 | 1848 | 1.1029 | 0.2895 |
| 0.0255 | 43.0 | 1892 | 1.0100 | 0.2719 |
| 0.0196 | 44.0 | 1936 | 1.1369 | 0.2544 |
| 0.0161 | 45.0 | 1980 | 1.2074 | 0.2632 |
| 0.019 | 46.0 | 2024 | 1.1192 | 0.2281 |
| 0.0165 | 47.0 | 2068 | 1.1283 | 0.3246 |
| 0.019 | 48.0 | 2112 | 1.1478 | 0.2368 |
| 0.0177 | 49.0 | 2156 | 1.1950 | 0.2544 |
| 0.0153 | 50.0 | 2200 | 1.0713 | 0.2719 |
| 0.0153 | 51.0 | 2244 | 1.2038 | 0.2544 |
| 0.0135 | 52.0 | 2288 | 1.1361 | 0.2368 |
| 0.0096 | 53.0 | 2332 | 1.1446 | 0.2632 |
| 0.0106 | 54.0 | 2376 | 1.1622 | 0.2632 |
| 0.0107 | 55.0 | 2420 | 1.2000 | 0.2807 |
| 0.0061 | 56.0 | 2464 | 1.3061 | 0.2719 |
| 0.0091 | 57.0 | 2508 | 1.1605 | 0.2368 |
| 0.0051 | 58.0 | 2552 | 1.1857 | 0.2456 |
| 0.0051 | 59.0 | 2596 | 1.2568 | 0.2368 |
| 0.0045 | 60.0 | 2640 | 1.2608 | 0.2544 |
| 0.0039 | 61.0 | 2684 | 1.2377 | 0.2544 |
| 0.0049 | 62.0 | 2728 | 1.1845 | 0.2368 |
| 0.0029 | 63.0 | 2772 | 1.3083 | 0.2281 |
| 0.0026 | 64.0 | 2816 | 1.2598 | 0.2544 |
| 0.0021 | 65.0 | 2860 | 1.3106 | 0.2632 |
| 0.0022 | 66.0 | 2904 | 1.2980 | 0.2368 |
| 0.0022 | 67.0 | 2948 | 1.3119 | 0.2544 |
| 0.0024 | 68.0 | 2992 | 1.2976 | 0.2193 |
| 0.0021 | 69.0 | 3036 | 1.2685 | 0.2456 |
| 0.002 | 70.0 | 3080 | 1.3019 | 0.2544 |
| 0.0012 | 71.0 | 3124 | 1.3376 | 0.2456 |
| 0.0012 | 72.0 | 3168 | 1.3415 | 0.2456 |
| 0.0011 | 73.0 | 3212 | 1.3483 | 0.2281 |
| 0.0012 | 74.0 | 3256 | 1.3545 | 0.2368 |
| 0.0025 | 75.0 | 3300 | 1.3497 | 0.2368 |
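The card does not define how the toxicity ratio is computed. One plausible reading, offered purely as an assumption, is the fraction of model generations that an off-the-shelf toxicity classifier flags as toxic; the sketch below uses `unitary/toxic-bert` and a 0.5 threshold, both illustrative choices:

```python
from transformers import pipeline

# Assumed metric: share of generations that a toxicity classifier flags as toxic.
# Both the classifier (unitary/toxic-bert) and the 0.5 threshold are illustrative;
# the card does not document the actual evaluation setup.
generator = pipeline("text2text-generation", model="AayushW/PoliteT5Small")
toxicity = pipeline("text-classification", model="unitary/toxic-bert")

eval_inputs = ["Give me the report now.", "Move out of my way."]  # placeholder eval set
generations = [out["generated_text"] for out in generator(eval_inputs, max_new_tokens=64)]

scores = toxicity(generations)
toxic = sum(1 for s in scores if s["label"] == "toxic" and s["score"] > 0.5)
print(f"Toxicity ratio: {toxic / len(generations):.4f}")
```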
### Framework versions

- Transformers 4.47.0
- Pytorch 2.5.1+cu121
- Datasets 3.3.1
- Tokenizers 0.21.0