RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter11_sftsd0

Generated from Trainer

collapse_gemma-2-2b_hs2_replace_iter11_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 2.5435
Num Input Tokens Seen: 4784104
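
The loss above is easier to interpret as a perplexity. The sketch below assumes the reported value is a mean per-token cross-entropy in nats (the usual convention for causal language model training, but not stated in this card), in which case perplexity is simply exp(loss). Under that assumption the final evaluation loss corresponds to a perplexity of roughly 12.7, compared with roughly 4.0 for the step-0 validation loss of 1.3909 listed in the training results.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss_nats)


# Values copied from this card.
final_eval_loss = 2.5435   # evaluation-set loss reported above
step0_val_loss = 1.3909    # validation loss at step 0 in the results table

# ASSUMPTION: both losses are mean per-token cross-entropy in nats.
final_ppl = perplexity(final_eval_loss)
step0_ppl = perplexity(step0_val_loss)

print(f"final eval perplexity:  {final_ppl:.2f}")
print(f"step-0 val perplexity: {step0_ppl:.2f}")
```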

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-06
train_batch_size: 8
eval_batch_size: 16
seed: 0
gradient_accumulation_steps: 16
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.05
num_epochs: 1
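
Two of the values above follow from the others, and a short sketch makes the arithmetic explicit. It assumes a single training process, so that total_train_batch_size is train_batch_size times gradient_accumulation_steps, and it models constant_with_warmup as a linear ramp from zero to the base learning rate followed by a flat schedule. The total step count of 98 is not given in this card; it is extrapolated from the results table (step 95 sits at epoch 0.9706) and is only an assumption. The helper names are illustrative, not part of any library.

```python
import math

# Hyperparameters as listed above.
LEARNING_RATE = 8e-06
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 16
WARMUP_RATIO = 0.05

# ASSUMPTION: about 98 optimizer steps in the single epoch, extrapolated
# from the results table; the card does not state the exact number.
ASSUMED_TOTAL_STEPS = 98


def effective_batch_size(batch_size, accum_steps, num_processes=1):
    """Sequences contributing to one optimizer update."""
    return batch_size * accum_steps * num_processes


def warmup_steps(total_steps, warmup_ratio):
    """Number of warmup steps, rounded up."""
    return math.ceil(total_steps * warmup_ratio)


def lr_at(step, base_lr, n_warmup):
    """Linear ramp from 0 to base_lr over n_warmup steps, then constant."""
    if step < n_warmup:
        return base_lr * step / max(1, n_warmup)
    return base_lr


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
n_warmup = warmup_steps(ASSUMED_TOTAL_STEPS, WARMUP_RATIO)

print("effective batch size:", total_batch)  # 8 * 16 = 128
print("warmup steps:", n_warmup)
for step in (0, 1, n_warmup, 50):
    print(step, lr_at(step, LEARNING_RATE, n_warmup))
```

With these numbers the warmup covers about the first 5 optimizer steps, after which the learning rate stays at 8e-06 for the rest of the epoch.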

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.5272	0.0511	5	1.2822	252768
0.8166	0.1022	10	1.3380	509408
0.4892	0.1533	15	1.5438	759784
0.206	0.2043	20	1.7476	1010152
0.2173	0.2554	25	1.9773	1248280
0.0722	0.3065	30	2.1778	1497952
0.0457	0.3576	35	2.3481	1738968
0.0783	0.4087	40	2.4090	1989136
0.0274	0.4598	45	2.4217	2237392
0.0255	0.5109	50	2.4545	2485808
0.0261	0.5619	55	2.4836	2737960
0.0232	0.6130	60	2.4909	2979504
0.0375	0.6641	65	2.4994	3225896
0.028	0.7152	70	2.4842	3463664
0.0235	0.7663	75	2.4755	3711680
0.0217	0.8174	80	2.4925	3954272
0.0213	0.8685	85	2.5131	4204248
0.0204	0.9195	90	2.5247	4446520
0.0226	0.9706	95	2.5365	4686904
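
The table shows training loss falling toward zero while validation loss rises after the first few steps. A minimal sketch for reading that off programmatically is given below; the rows are copied from the table, and the helper names are illustrative only.

```python
# (train_loss, epoch, step, val_loss, input_tokens_seen), copied from the table above.
# train_loss is None where the table says "No log".
ROWS = [
    (None,   0.0,    0,  1.3909, 0),
    (1.5272, 0.0511, 5,  1.2822, 252768),
    (0.8166, 0.1022, 10, 1.3380, 509408),
    (0.4892, 0.1533, 15, 1.5438, 759784),
    (0.206,  0.2043, 20, 1.7476, 1010152),
    (0.2173, 0.2554, 25, 1.9773, 1248280),
    (0.0722, 0.3065, 30, 2.1778, 1497952),
    (0.0457, 0.3576, 35, 2.3481, 1738968),
    (0.0783, 0.4087, 40, 2.4090, 1989136),
    (0.0274, 0.4598, 45, 2.4217, 2237392),
    (0.0255, 0.5109, 50, 2.4545, 2485808),
    (0.0261, 0.5619, 55, 2.4836, 2737960),
    (0.0232, 0.6130, 60, 2.4909, 2979504),
    (0.0375, 0.6641, 65, 2.4994, 3225896),
    (0.028,  0.7152, 70, 2.4842, 3463664),
    (0.0235, 0.7663, 75, 2.4755, 3711680),
    (0.0217, 0.8174, 80, 2.4925, 3954272),
    (0.0213, 0.8685, 85, 2.5131, 4204248),
    (0.0204, 0.9195, 90, 2.5247, 4446520),
    (0.0226, 0.9706, 95, 2.5365, 4686904),
]


def best_row(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


def val_train_gap(row):
    """Validation loss minus training loss, or None if no training loss was logged."""
    train_loss, _, _, val_loss, _ = row
    return None if train_loss is None else val_loss - train_loss


best = best_row(ROWS)
last = ROWS[-1]

print(f"lowest validation loss: {best[3]} at step {best[2]}")
print(f"final val - train gap:  {val_train_gap(last):.4f}")
```

The lowest validation loss in the table is 1.2822 at step 5; every later logged checkpoint is worse on the evaluation set even though its training loss is lower.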

Framework versions

Transformers 4.44.0
Pytorch 2.4.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1
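
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch only: it assumes "Pytorch" corresponds to the `torch` distribution on PyPI, and `find_mismatches` is a hypothetical helper, not part of any of these libraries.

```python
from importlib import metadata

# Versions listed above, keyed by PyPI distribution name.
# ASSUMPTION: "Pytorch" corresponds to the `torch` distribution.
EXPECTED = {
    "transformers": "4.44.0",
    "torch": "2.4.0+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def find_mismatches(expected, lookup=metadata.version):
    """Return {package: (expected, installed)} for each package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            have = lookup(name)
        except metadata.PackageNotFoundError:
            have = None
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in find_mismatches(EXPECTED).items():
        print(f"{name}: card lists {want}, installed {have}")
```

An exact string match is deliberately strict; a different CUDA build of PyTorch (for example a suffix other than +cu121) will be reported as a mismatch even if the base version agrees.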

Safetensors

Model size: 2.61B params
Tensor type: BF16
