Built with Axolotl

6a7994be-dc68-456f-8f77-140704ee2feb

This model is a fine-tuned version of fxmarty/really-tiny-falcon-testing on an unspecified dataset. It achieves the following results on the evaluation set (see the loading sketch after the result):

  • Loss: 10.9513
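
Since the framework versions below list PEFT, this checkpoint can be loaded as an adapter on top of the base model. A minimal sketch follows; the repository id is this model's own, while the prompt and generation settings are illustrative assumptions, not values from the training config.

```python
# Minimal sketch: load the PEFT adapter on the base model for inference.
# The prompt and max_new_tokens below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "fxmarty/really-tiny-falcon-testing"
adapter_id = "lesso11/6a7994be-dc68-456f-8f77-140704ee2feb"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

inputs = tokenizer("Hello", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```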

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.000211
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: 8-bit AdamW via bitsandbytes (OptimizerNames.ADAMW_BNB) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 500
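
For reference, a hedged sketch of these hyperparameters expressed as Hugging Face TrainingArguments (transformers 4.46 API); output_dir and the evaluation cadence are assumptions, everything else mirrors the list above.

```python
# Sketch: the hyperparameters above as TrainingArguments.
# output_dir and eval settings are assumptions; the rest mirrors the list.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",           # assumption: not stated in the card
    learning_rate=0.000211,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # 4 * 2 = total train batch size of 8
    optim="adamw_bnb_8bit",         # OptimizerNames.ADAMW_BNB
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=500,
    eval_strategy="steps",          # assumption: eval every 50 steps,
    eval_steps=50,                  # matching the results table below
)
```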

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log        | 0.0009 | 1    | 11.0891         |
| 22.0247       | 0.0456 | 50   | 11.0057         |
| 21.973        | 0.0912 | 100  | 10.9840         |
| 21.9541       | 0.1367 | 150  | 10.9738         |
| 21.9397       | 0.1823 | 200  | 10.9663         |
| 21.9218       | 0.2279 | 250  | 10.9605         |
| 21.9189       | 0.2735 | 300  | 10.9574         |
| 21.917        | 0.3191 | 350  | 10.9544         |
| 21.9109       | 0.3646 | 400  | 10.9532         |
| 21.9095       | 0.4558 | 500  | 10.9513         |
| 21.9099       | 0.4102 | 450  | 10.9515         |

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • PyTorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1
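
These are the exact versions used for training; pinning them is the safest route when reproducing the run. A small standard-library sketch for checking a local environment against them (package names are the PyPI ones):

```python
# Sketch: verify installed package versions against the card's list.
from importlib.metadata import version

expected = {
    "peft": "0.13.2",
    "transformers": "4.46.0",
    "torch": "2.5.0+cu124",
    "datasets": "3.0.1",
    "tokenizers": "0.20.1",
}
for pkg, want in expected.items():
    have = version(pkg)
    status = "OK" if have == want else f"mismatch (have {have})"
    print(f"{pkg}=={want}: {status}")
```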