MammoLLM

This model is a fine-tuned version of gpt2. The training dataset is not specified. Evaluation-set results are given by the validation loss in the training results table below.

Model description

More information needed

The training hyperparameters are not listed. Training and validation loss were logged every 500 steps over roughly 20 epochs:

Training Loss	Epoch	Step	Validation Loss
1.5202	0.62	500	1.1041
1.1505	1.24	1000	1.0581
1.1018	1.87	1500	1.0286
1.0694	2.49	2000	1.0166
1.0574	3.11	2500	1.0041
1.0351	3.73	3000	0.9909
1.0193	4.36	3500	0.9865
1.0137	4.98	4000	0.9799
0.993	5.6	4500	0.9745
0.9813	6.22	5000	0.9632
0.9728	6.85	5500	0.9573
0.9534	7.47	6000	0.9521
0.9474	8.09	6500	0.9481
0.9264	8.71	7000	0.9405
0.9099	9.33	7500	0.9365
0.9017	9.96	8000	0.9292
0.8735	10.58	8500	0.9267
0.8623	11.2	9000	0.9268
0.8444	11.82	9500	0.9168
0.8205	12.45	10000	0.9148
0.8111	13.07	10500	0.9129
0.7842	13.69	11000	0.9129
0.767	14.31	11500	0.9138
0.759	14.93	12000	0.9094
0.7329	15.56	12500	0.9109
0.7261	16.18	13000	0.9145
0.7121	16.8	13500	0.9145
0.7038	17.42	14000	0.9161
0.699	18.05	14500	0.9167
0.6902	18.67	15000	0.9169
0.6883	19.29	15500	0.9172
0.6873	19.91	16000	0.9172
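
The table can be read programmatically to find where validation loss is lowest. The sketch below is a minimal example, not part of the original training code: the names `VAL_LOSS`, `best_checkpoint`, and `perplexity` are hypothetical, and the perplexity conversion assumes the reported loss is mean token-level cross-entropy in nats, which the card does not confirm.

```python
import math

# (step, validation_loss) pairs copied from the training results table.
VAL_LOSS = [
    (500, 1.1041), (1000, 1.0581), (1500, 1.0286), (2000, 1.0166),
    (2500, 1.0041), (3000, 0.9909), (3500, 0.9865), (4000, 0.9799),
    (4500, 0.9745), (5000, 0.9632), (5500, 0.9573), (6000, 0.9521),
    (6500, 0.9481), (7000, 0.9405), (7500, 0.9365), (8000, 0.9292),
    (8500, 0.9267), (9000, 0.9268), (9500, 0.9168), (10000, 0.9148),
    (10500, 0.9129), (11000, 0.9129), (11500, 0.9138), (12000, 0.9094),
    (12500, 0.9109), (13000, 0.9145), (13500, 0.9145), (14000, 0.9161),
    (14500, 0.9167), (15000, 0.9169), (15500, 0.9172), (16000, 0.9172),
]


def best_checkpoint(history):
    """Return the (step, loss) pair with the lowest loss (earliest on ties)."""
    return min(history, key=lambda pair: pair[1])


def perplexity(loss):
    """Convert a loss to perplexity, ASSUMING mean cross-entropy in nats."""
    return math.exp(loss)


best_step, best_loss = best_checkpoint(VAL_LOSS)
final_step, final_loss = VAL_LOSS[-1]

print(f"lowest validation loss: {best_loss:.4f} at step {best_step}")
print(f"final validation loss:  {final_loss:.4f} at step {final_step}")
print(f"perplexity at best step (assumed nats): {perplexity(best_loss):.3f}")
```

On these numbers the lowest validation loss is at step 12000, while training loss keeps falling through step 16000. That pattern is consistent with mild overfitting in the later epochs, so an earlier checkpoint may generalize slightly better than the final one, if intermediate checkpoints were saved.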
