calculator_model_test

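This model is a fine-tuned version of a base model that the card does not name, trained on an unspecified dataset (recorded as None). On the evaluation set it reaches a validation loss of 0.6459 at the final epoch (epoch 40, step 240).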

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
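
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate after a given number of optimizer steps. It assumes zero warmup steps (the card lists none) and 240 total steps, the final step reported in the training results. `linear_lr` is a hypothetical helper written for this example, not a library function.

```python
BASE_LR = 0.001      # learning_rate from the card
TOTAL_STEPS = 240    # final step reported in the training results
WARMUP_STEPS = 0     # assumption: the card does not mention warmup


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate after `step` optimizer steps with linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 60, 120, 180, 240):
        print(f"step {step:3d}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.001, is halved by step 120 (epoch 20), and reaches zero at step 240.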

Training Loss	Epoch	Step	Validation Loss
3.3906	1.0	6	2.7256
2.3769	2.0	12	1.9990
1.8610	3.0	18	1.6958
1.6322	4.0	24	1.5805
1.5721	5.0	30	1.5422
1.5184	6.0	36	1.5180
1.4832	7.0	42	1.5010
1.4849	8.0	48	1.5123
1.4620	9.0	54	1.4417
1.4298	10.0	60	1.4435
1.4279	11.0	66	1.4251
1.4448	12.0	72	1.3547
1.3379	13.0	78	1.3969
1.3537	14.0	84	1.2676
1.2642	15.0	90	1.1872
1.2176	16.0	96	1.1717
1.1637	17.0	102	1.1720
1.1542	18.0	108	1.0787
1.0647	19.0	114	1.0448
1.0532	20.0	120	0.9690
0.9851	21.0	126	1.0202
1.0039	22.0	132	0.9606
0.9611	23.0	138	0.9038
0.9450	24.0	144	0.9449
0.9189	25.0	150	0.8462
0.8700	26.0	156	0.8271
0.8647	27.0	162	0.8263
0.8469	28.0	168	0.7877
0.8159	29.0	174	0.7557
0.7941	30.0	180	0.7365
0.7619	31.0	186	0.7333
0.7643	32.0	192	0.7197
0.7689	33.0	198	0.7006
0.7508	34.0	204	0.6877
0.7349	35.0	210	0.6797
0.7214	36.0	216	0.6733
0.7172	37.0	222	0.6600
0.7152	38.0	228	0.6582
0.6935	39.0	234	0.6510
0.6960	40.0	240	0.6459
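
The step counts in the table also allow a back-of-the-envelope estimate of the training set size. The sketch below assumes that train_batch_size is the effective batch per optimizer step (single device, no gradient accumulation) and that the last partial batch of each epoch is kept. None of this is stated in the card, and `train_set_size_bounds` is a hypothetical helper.

```python
TRAIN_BATCH_SIZE = 512   # train_batch_size from the card
TOTAL_STEPS = 240        # last step in the table
NUM_EPOCHS = 40          # num_epochs from the card

# 240 steps over 40 epochs gives 6 optimizer steps per epoch.
STEPS_PER_EPOCH = TOTAL_STEPS // NUM_EPOCHS


def train_set_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple[int, int]:
    """Smallest and largest example counts that yield `steps_per_epoch` batches.

    Assumes batches = ceil(n_examples / batch_size), i.e. the last partial
    batch is not dropped.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


if __name__ == "__main__":
    low, high = train_set_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)
    print(f"{STEPS_PER_EPOCH} steps per epoch -> between {low} and {high} training examples")
```

If those assumptions hold, the training set contains between 2561 and 3072 examples.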

Safetensors

Model size: 7.8M params
Tensor type: F32
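
The parameter count and tensor type together give a rough size for the stored weights. The sketch below treats 7.8M as exactly 7,800,000 parameters, which is an assumption because the card only gives the rounded figure, and it ignores the small Safetensors header. `weight_bytes` is a hypothetical helper.

```python
NUM_PARAMS = 7_800_000          # assumption: "7.8M params" taken as an exact count
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weight_bytes(num_params: int, dtype: str = "F32") -> int:
    """Approximate bytes needed to store the weights in the given tensor type."""
    return num_params * BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    size = weight_bytes(NUM_PARAMS, "F32")
    print(f"F32 weights: {size / 1e6:.1f} MB ({size / 2**20:.2f} MiB)")
```

At four bytes per parameter this comes to roughly 31.2 MB of weights, so the checkpoint should be small enough to load on CPU.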
