Llama3.2-11B-Vision-Instruct-Neutrino

This model is a fine-tuned version of meta-llama/Llama-3.2-11B-Vision-Instruct on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0064

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 3
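
The total_train_batch_size above follows from the per-device batch size and gradient accumulation; a minimal sketch of the arithmetic (single-device training assumed):

```python
# Effective (total) train batch size is the per-device batch size
# multiplied by the number of gradient accumulation steps.
train_batch_size = 4
gradient_accumulation_steps = 8

total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 32
```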

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.0437        | 0.1001 | 50   | 0.0397          |
| 0.0077        | 0.2001 | 100  | 0.0195          |
| 0.0071        | 0.3002 | 150  | 0.0071          |
| 0.0073        | 0.4002 | 200  | 0.0070          |
| 0.0068        | 0.5003 | 250  | 0.0068          |
| 0.0069        | 0.6003 | 300  | 0.0067          |
| 0.0065        | 0.7004 | 350  | 0.0068          |
| 0.0066        | 0.8004 | 400  | 0.0066          |
| 0.0065        | 0.9005 | 450  | 0.0066          |
| 0.0062        | 1.0008 | 500  | 0.0066          |
| 0.0069        | 1.1008 | 550  | 0.0068          |
| 0.0062        | 1.2009 | 600  | 0.0065          |
| 0.0063        | 1.3009 | 650  | 0.0065          |
| 0.0065        | 1.4010 | 700  | 0.0065          |
| 0.0063        | 1.5010 | 750  | 0.0066          |
| 0.0064        | 1.6011 | 800  | 0.0065          |
| 0.0066        | 1.7011 | 850  | 0.0064          |
| 0.0064        | 1.8012 | 900  | 0.0064          |
| 0.0063        | 1.9012 | 950  | 0.0064          |
| 0.0064        | 2.0015 | 1000 | 0.0064          |
| 0.006         | 2.1016 | 1050 | 0.0064          |
| 0.0059        | 2.2016 | 1100 | 0.0064          |
| 0.0065        | 2.3017 | 1150 | 0.0064          |
| 0.006         | 2.4017 | 1200 | 0.0064          |
| 0.0063        | 2.5018 | 1250 | 0.0063          |
| 0.0061        | 2.6018 | 1300 | 0.0064          |
| 0.0064        | 2.7019 | 1350 | 0.0064          |
| 0.0065        | 2.8019 | 1400 | 0.0064          |
| 0.0064        | 2.9020 | 1450 | 0.0064          |

Framework versions

  • PEFT 0.13.0
  • Transformers 4.46.2
  • Pytorch 2.4.1
  • Datasets 3.0.1
  • Tokenizers 0.20.3
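
Since the PEFT version above indicates this checkpoint is an adapter rather than full model weights, it would typically be attached to the base model at load time. A minimal sketch, assuming the adapter is published under the repo id dikshantsagar/Llama3.2-11B-Vision-Instruct-Neutrino and that standard `transformers`/`peft` loading applies (not executed here; requires access to the gated base model and substantial GPU memory):

```python
from transformers import MllamaForConditionalGeneration, AutoProcessor
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
adapter_id = "dikshantsagar/Llama3.2-11B-Vision-Instruct-Neutrino"  # assumed repo id

# Load the base vision-instruct model, then attach the fine-tuned adapter.
base = MllamaForConditionalGeneration.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)
processor = AutoProcessor.from_pretrained(base_id)
```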