---
base_model: meta-llama/Llama-3.2-3B-Instruct
datasets:
- tatsu-lab/alpaca
language: en
tags:
- torchtune
---
|
|
|
# my_cool_model |
|
|
|
This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on the [tatsu-lab/alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca) dataset.
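To try the checkpoint locally, a minimal sketch using `huggingface-cli` (part of `huggingface_hub`); the repository id below is a placeholder, since this card does not record where the model is published:

```bash
# Hypothetical repo id; replace with wherever this model is actually hosted.
huggingface-cli download <your-username>/my_cool_model --local-dir ./my_cool_model
```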
|
|
|
# Model description |
|
|
|
More information needed |
|
|
|
# Training and evaluation results |
|
|
|
More information needed |
|
|
|
# Training procedure |
|
|
|
This model was trained with the [torchtune](https://github.com/pytorch/torchtune) library using the following command:
|
|
|
```bash
ppo_full_finetune_single_device.py --config ./target/7B_full_ppo_low_memory_single_device.yaml \
  device=cuda \
  metric_logger._component_=torchtune.utils.metric_logging.WandBLogger \
  metric_logger.project=torchtune_ppo \
  forward_batch_size=2 \
  batch_size=64 \
  ppo_batch_size=32 \
  gradient_accumulation_steps=16 \
  compile=True \
  optimizer._component_=bitsandbytes.optim.PagedAdamW \
  optimizer.lr=3e-4
```
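The `key=value` pairs override fields in the YAML config: Weights & Biases metric logging, the batch sizes, `torch.compile`, and `bitsandbytes.optim.PagedAdamW`, a paged optimizer that reduces peak GPU memory by paging optimizer state. Note that torchtune recipes are normally launched through the `tune` CLI rather than invoked as scripts; as a hedged sketch, assuming a recent torchtune install in which `ppo_full_finetune_single_device` is a registered recipe, an equivalent run would look like:

```bash
# Fetch the base checkpoint first (the Llama 3.2 repo is gated, so a
# Hugging Face token with access is required).
tune download meta-llama/Llama-3.2-3B-Instruct \
  --output-dir /tmp/Llama-3.2-3B-Instruct \
  --hf-token <HF_TOKEN>

# Launch the recipe; append the remaining key=value overrides exactly
# as in the command above.
tune run ppo_full_finetune_single_device \
  --config ./target/7B_full_ppo_low_memory_single_device.yaml \
  device=cuda compile=True
```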
|
|
|
|
|
|
|
# Framework versions |
|
|
|
- torchtune
- torchao 0.5.0
- datasets 2.20.0
- sentencepiece 0.2.0
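The card does not pin a torchtune version, and the training command additionally depends on bitsandbytes (for `PagedAdamW`), which is not listed above. A hedged sketch for approximating this environment:

```bash
# Versions for torchtune and bitsandbytes were not recorded in this
# card, so those two packages are left unpinned here.
pip install torchtune bitsandbytes torchao==0.5.0 datasets==2.20.0 sentencepiece==0.2.0
```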
|
|
|
|