Built with Axolotl

vicuna_7b_stage1

This model is a fine-tuned version of lmsys/vicuna-7b-v1.5. The training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: nan

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 40
  • num_epochs: 2
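
The totals in the list above follow from the per-device values, and the learning rate follows a warmup-then-cosine shape. The sketch below reproduces that arithmetic in plain Python. It is only an illustration: the function names are hypothetical, the schedule assumes linear warmup followed by a half-cosine decay to zero (the usual behaviour of a `cosine` scheduler, not confirmed by this card), and `total_steps=1950` is an assumption estimated from the step and epoch columns of the training results table, not a value reported here.

```python
import math

# Values copied from the hyperparameter list.
TRAIN_BATCH_SIZE = 2            # per device
EVAL_BATCH_SIZE = 2             # per device
NUM_DEVICES = 4
GRADIENT_ACCUMULATION_STEPS = 4
LEARNING_RATE = 5e-4
WARMUP_STEPS = 40


def effective_batch_sizes():
    """Return (total_train_batch_size, total_eval_batch_size)."""
    # Training accumulates gradients over several micro-batches per device.
    train = TRAIN_BATCH_SIZE * NUM_DEVICES * GRADIENT_ACCUMULATION_STEPS
    # Evaluation does not use gradient accumulation.
    evaluation = EVAL_BATCH_SIZE * NUM_DEVICES
    return train, evaluation


def cosine_lr(step, total_steps, peak_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`, assuming linear warmup then half-cosine decay."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(effective_batch_sizes())
    # total_steps=1950 is an assumed value (about 975 steps per epoch, 2 epochs).
    for step in (0, 20, 40, 1000, 1950):
        print(step, cosine_lr(step, total_steps=1950))
```

Under these assumptions the learning rate rises linearly to 0.0005 at step 40 and then decays towards zero by the final step.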

Training results

Training Loss | Epoch | Step | Validation Loss
3.0038 | 0.0410 | 40 | nan
2.881 | 0.0821 | 80 | nan
2.7344 | 0.1231 | 120 | nan
2.7523 | 0.1641 | 160 | nan
2.8974 | 0.2051 | 200 | nan
2.8822 | 0.2462 | 240 | nan
2.8679 | 0.2872 | 280 | nan
2.8764 | 0.3282 | 320 | nan
2.7328 | 0.3692 | 360 | nan
2.6559 | 0.4103 | 400 | nan
2.9797 | 0.4513 | 440 | nan
2.6416 | 0.4923 | 480 | nan
2.5732 | 0.5333 | 520 | nan
2.5785 | 0.5744 | 560 | nan
2.6892 | 0.6154 | 600 | nan
2.5908 | 0.6564 | 640 | nan
2.5158 | 0.6974 | 680 | nan
2.43 | 0.7385 | 720 | nan
2.4511 | 0.7795 | 760 | nan
2.3835 | 0.8205 | 800 | nan
2.1151 | 0.8615 | 840 | nan
2.4363 | 0.9026 | 880 | nan
2.3235 | 0.9436 | 920 | nan
2.2581 | 0.9846 | 960 | nan
2.1462 | 1.0256 | 1000 | nan
2.163 | 1.0667 | 1040 | nan
2.2433 | 1.1077 | 1080 | nan
2.0609 | 1.1487 | 1120 | nan
1.9598 | 1.1897 | 1160 | nan
2.1146 | 1.2308 | 1200 | nan
2.0555 | 1.2718 | 1240 | nan
1.8906 | 1.3128 | 1280 | nan
1.8369 | 1.3538 | 1320 | nan
1.9503 | 1.3949 | 1360 | nan
1.8217 | 1.4359 | 1400 | nan
1.9437 | 1.4769 | 1440 | nan
1.7392 | 1.5179 | 1480 | nan
1.7494 | 1.5590 | 1520 | nan
1.7624 | 1.6 | 1560 | nan
1.7191 | 1.6410 | 1600 | nan
1.8216 | 1.6821 | 1640 | nan
1.7724 | 1.7231 | 1680 | nan
1.78 | 1.7641 | 1720 | nan
1.5865 | 1.8051 | 1760 | nan
1.7514 | 1.8462 | 1800 | nan
1.7128 | 1.8872 | 1840 | nan
1.8041 | 1.9282 | 1880 | nan
1.6393 | 1.9692 | 1920 | nan
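
The step and epoch columns can be used to estimate how many optimizer steps make up one epoch, and the validation column can be checked for NaN programmatically. The sketch below does both on three rows copied from the training results. The function names are hypothetical, and the result is only an estimate because the epoch values are rounded to four decimals.

```python
import math

# (training_loss, epoch, step, validation_loss) rows copied from the results.
ROWS = [
    (3.0038, 0.0410, 40, float("nan")),
    (2.2581, 0.9846, 960, float("nan")),
    (1.6393, 1.9692, 1920, float("nan")),
]


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    # The last row has the largest epoch value, so rounding affects it least.
    _, epoch, step, _ = rows[-1]
    return step / epoch


def all_validation_nan(rows):
    """Return True when every validation loss in `rows` is NaN."""
    # NaN never compares equal to itself, so use math.isnan rather than ==.
    return all(math.isnan(validation_loss) for *_, validation_loss in rows)


if __name__ == "__main__":
    print(round(steps_per_epoch(ROWS)))
    print(all_validation_nan(ROWS))
```

On these rows the estimate is about 975 steps per epoch. At the listed total_train_batch_size of 32, that would be roughly 31,200 training sequences per epoch, assuming every step consumes a full batch.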

Framework versions

  • Transformers 4.45.2
  • PyTorch 2.6.0.dev20241122+rocm6.2
  • Datasets 2.14.7
  • Tokenizers 0.20.3

Model tree for Yingbing/vicuna_7b_medusa_snapkv

Base model: lmsys/vicuna-7b-v1.5 (this model is a fine-tune of it)