01yubin/llama-fine-tuned


	---
	library_name: peft
	license: llama3.2
	base_model: meta-llama/Llama-3.2-1B-Instruct
	tags:
	- generated_from_trainer
	model-index:
	- name: llama-fine-tuned
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# llama-fine-tuned

	This model is a fine-tuned version of [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) on an unknown dataset.
	It achieves the following result on the evaluation set at the end of training (step 1400):
	- Loss: 0.3503
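
The front matter lists `library_name: peft`, which suggests this repository holds an adapter rather than full model weights. A minimal loading sketch under that assumption follows. The adapter repo id `01yubin/llama-fine-tuned` is inferred from the page header, `load_adapter` is a hypothetical helper name, and access to the gated base model is assumed.

```python
BASE_MODEL_ID = "meta-llama/Llama-3.2-1B-Instruct"  # from the card's base_model field
ADAPTER_ID = "01yubin/llama-fine-tuned"  # assumed repo id, inferred from the page header


def load_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the PEFT adapter (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Attach the adapter weights on top of the frozen base model.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `tokenizer, model = load_adapter()` downloads both the base model and the adapter, so it needs network access and an accepted Llama 3.2 license on the Hub.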

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 2
	- eval_batch_size: 8
	- seed: 42
	- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
	- lr_scheduler_type: linear
	- num_epochs: 1
	- mixed_precision_training: Native AMP

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss |
	|:-------------:|:------:|:----:|:---------------:|
	| 1.6035        | 0.0357 | 50   | 0.9702          |
	| 0.6392        | 0.0714 | 100  | 0.6272          |
	| 0.5979        | 0.1071 | 150  | 0.5398          |
	| 0.5629        | 0.1429 | 200  | 0.5044          |
	| 0.4761        | 0.1786 | 250  | 0.4689          |
	| 0.4998        | 0.2143 | 300  | 0.4494          |
	| 0.4363        | 0.25   | 350  | 0.4524          |
	| 0.4433        | 0.2857 | 400  | 0.4322          |
	| 0.4882        | 0.3214 | 450  | 0.4135          |
	| 0.4316        | 0.3571 | 500  | 0.4017          |
	| 0.389         | 0.3929 | 550  | 0.3951          |
	| 0.4041        | 0.4286 | 600  | 0.3908          |
	| 0.456         | 0.4643 | 650  | 0.3860          |
	| 0.3872        | 0.5    | 700  | 0.3788          |
	| 0.3962        | 0.5357 | 750  | 0.3792          |
	| 0.3524        | 0.5714 | 800  | 0.3762          |
	| 0.3409        | 0.6071 | 850  | 0.3700          |
	| 0.421         | 0.6429 | 900  | 0.3746          |
	| 0.349         | 0.6786 | 950  | 0.3634          |
	| 0.4194        | 0.7143 | 1000 | 0.3665          |
	| 0.3621        | 0.75   | 1050 | 0.3607          |
	| 0.3663        | 0.7857 | 1100 | 0.3603          |
	| 0.3434        | 0.8214 | 1150 | 0.3592          |
	| 0.3609        | 0.8571 | 1200 | 0.3553          |
	| 0.342         | 0.8929 | 1250 | 0.3524          |
	| 0.3889        | 0.9286 | 1300 | 0.3513          |
	| 0.3604        | 0.9643 | 1350 | 0.3508          |
	| 0.354         | 1.0    | 1400 | 0.3503          |
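
Two quick readings of the results table can be sketched in code. If the reported loss is the mean per-token cross-entropy in nats (an assumption, since the card does not say), then `exp(loss)` gives a perplexity. Scanning consecutive evaluations also shows where validation loss rose instead of falling. The list copies the Validation Loss column; `perplexity` and `upticks` are illustrative helper names.

```python
import math

# Validation losses copied from the table, one per evaluation (every 50 steps).
LOSSES = [
    0.9702, 0.6272, 0.5398, 0.5044, 0.4689, 0.4494, 0.4524,
    0.4322, 0.4135, 0.4017, 0.3951, 0.3908, 0.3860, 0.3788,
    0.3792, 0.3762, 0.3700, 0.3746, 0.3634, 0.3665, 0.3607,
    0.3603, 0.3592, 0.3553, 0.3524, 0.3513, 0.3508, 0.3503,
]
# Pair each loss with its step: 50, 100, ..., 1400.
VAL_LOSS = [(50 * (i + 1), loss) for i, loss in enumerate(LOSSES)]


def perplexity(loss):
    """Perplexity, assuming the loss is mean per-token cross-entropy in nats."""
    return math.exp(loss)


def upticks(history):
    """Steps at which validation loss is higher than at the previous evaluation."""
    return [
        step
        for (_, prev), (step, cur) in zip(history, history[1:])
        if cur > prev
    ]


if __name__ == "__main__":
    final_step, final_loss = VAL_LOSS[-1]
    print(f"step {final_step}: perplexity ~ {perplexity(final_loss):.2f}")
    print("validation loss rose at steps:", upticks(VAL_LOSS))
```

The rises at steps 350, 750, 900 and 1000 are each below 0.005, so the overall trend is a steady decrease that is flattening by the end of the epoch.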


	### Framework versions

	- PEFT 0.13.2
	- Transformers 4.47.0
	- PyTorch 2.5.1+cu121
	- Tokenizers 0.21.0
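
To compare a local environment against the versions listed above, a small check such as the following may help. It is a sketch that assumes the PyPI distribution names `peft`, `transformers`, `torch` and `tokenizers`; `compare_versions` is a hypothetical helper. The installed `torch` version string may or may not carry the `+cu121` suffix depending on how it was installed.

```python
from importlib import metadata

# Versions reported in the card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "peft": "0.13.2",
    "transformers": "4.47.0",
    "torch": "2.5.1+cu121",
    "tokenizers": "0.21.0",
}


def compare_versions(expected=EXPECTED):
    """Return {name: (expected, installed_or_None, exact_match)} per package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

An exact match is not necessarily required to load the adapter; the report only flags where the environment differs from the one used for training.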