---
datasets:
- blesspearl/stackexchange-math-sample
language:
- en
library_name: transformers
license: mit
---
# Fine-Tuned LLaMA 3.1 Model on Stack Exchange Math Dataset
This repository contains a LLaMA 3.1 model fine-tuned with LoRA on a dataset collected from Stack Exchange Math. The model is designed to answer mathematical questions in a style similar to Stack Exchange answers. The fine-tuning process follows this [guide](https://medium.com/@rajatsharma_33357/fine-tuning-llama-using-lora-fb3f48a557d5).
## Model Details
- **Base Model:** [Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B)
- **Fine-Tuned Model:** [math-stackexchange](https://huggingface.co/blesspearl/math-stackexchange)
- **Dataset:** [stackexchange-math-sample](https://huggingface.co/datasets/blesspearl/stackexchange-math-sample)
- **Training Environment:**
- Framework: PyTorch with Transformers
- Platform: Google Colab
- Hardware: 1 x T4 GPU (15GB)
## Data Preparation
The dataset used for fine-tuning includes 1000 samples collected from Stack Exchange Math. Each sample consists of a question and its accepted answer.
### Preprocessing
The data was preprocessed using the following steps:
1. Loading the dataset from Hugging Face.
2. Shuffling the dataset and selecting 1000 samples.
3. Formatting the data into a chat template suitable for training.
## Training Details
### Libraries and Dependencies
```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments, pipeline, logging
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training, get_peft_model
from google.colab import drive, userdata
import os, torch, wandb
from trl import SFTTrainer, setup_chat_format
from huggingface_hub import login
```
### Loading Data and Model
```python
model_name = "meta-llama/Meta-Llama-3.1-8B"
dataset_name = "blesspearl/stackexchange-math-sample"
torch_dtype = torch.float16
attn_implementation = "eager"

# Log the training run to Weights & Biases
wandb.login(key=userdata.get("WANDB_API_KEY"))
run = wandb.init(
    project="Fine-tuning Llama-3.1-8B on math-stack-exchange",
    job_type="training",
    anonymous="allow",
)

# 4-bit NF4 quantization so the 8B model fits on a single 15GB T4 GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    attn_implementation=attn_implementation,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Add a chat template and the corresponding special tokens to the base model
model, tokenizer = setup_chat_format(model, tokenizer)
```
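As a quick sanity check (not part of the original notebook), the quantized model's memory footprint can be printed to confirm it fits within the T4's 15GB of memory:
```python
# Approximate GPU memory used by the 4-bit quantized weights
print(f"Model footprint: {model.get_memory_footprint() / 1e9:.2f} GB")
```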
### LoRA Configuration
```python
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    # Apply LoRA to all attention and MLP projection layers
    target_modules=["up_proj", "down_proj", "gate_proj", "k_proj", "q_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, peft_config)
```
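To verify that only the LoRA adapter weights are trainable, the standard PEFT helper can be called (a quick check, not part of the original notebook):
```python
# Reports trainable vs. total parameters; with r=16 on these modules,
# roughly 0.5% of the 8B parameters are trainable
model.print_trainable_parameters()
```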
### Data Preparation
```python
dataset = load_dataset(dataset_name, split="all")
dataset = dataset.shuffle(seed=65).select(range(1000))

def format_chat_template(row):
    # Wrap each question/answer pair in the chat template added by setup_chat_format
    row_json = [
        {"role": "user", "content": row["question_body"]},
        {"role": "assistant", "content": row["accepted_answer"]},
    ]
    row["text"] = tokenizer.apply_chat_template(row_json, tokenize=False)
    return row

dataset = dataset.map(format_chat_template, num_proc=4)
dataset = dataset.train_test_split(test_size=0.2)
dataset = dataset.remove_columns(["question_body", "accepted_answer"])
```
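Printing one formatted sample is a useful sanity check; the exact special tokens depend on the chat template installed by `setup_chat_format` (shown here only as an illustration):
```python
# Inspect the first training example after formatting
print(dataset["train"][0]["text"])
# Expected ChatML-style shape:
# <|im_start|>user ... <|im_end|> <|im_start|>assistant ... <|im_end|>
```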
### Training Configuration
```python
training_arguments = TrainingArguments(
    output_dir="math-stackexchange",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,   # effective batch size of 2
    optim="paged_adamw_32bit",
    num_train_epochs=1,
    evaluation_strategy="steps",
    eval_steps=0.2,                  # evaluate every 20% of training steps
    logging_steps=1,
    warmup_steps=10,
    logging_strategy="steps",
    learning_rate=2e-4,
    fp16=False,
    bf16=False,
    group_by_length=True,
    report_to="wandb",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    peft_config=peft_config,
    max_seq_length=512,
    dataset_text_field="text",
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,
)

trainer.train()
wandb.finish()

# Re-enable the KV cache (disabled during training) for faster inference
model.config.use_cache = True
```
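The notebook does not show how the fine-tuned weights were exported. A common pattern with TRL and PEFT looks roughly like the sketch below; the adapter directory name is an assumption for illustration, and the target repository ID is the one from this card. It reuses the imports from the Libraries section above.
```python
# Save the LoRA adapter produced by training (directory name is an assumption)
trainer.model.save_pretrained("math-stackexchange-adapter")
tokenizer.save_pretrained("math-stackexchange-adapter")

# Optionally merge the adapter into an fp16 copy of the base model for standalone inference
# (merging directly into the 4-bit quantized model is not supported)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B", torch_dtype=torch.float16, device_map="auto"
)
tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B")
# Re-apply the same chat template so the special tokens match the adapter
base, tok = setup_chat_format(base, tok)

merged = PeftModel.from_pretrained(base, "math-stackexchange-adapter").merge_and_unload()
merged.push_to_hub("blesspearl/math-stackexchange")
tok.push_to_hub("blesspearl/math-stackexchange")
```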
## Model and Dataset
- **Model:** [math-stackexchange](https://huggingface.co/blesspearl/math-stackexchange)
- **Dataset:** [stackexchange-math-sample](https://huggingface.co/datasets/blesspearl/stackexchange-math-sample)
## Usage
To use the fine-tuned model for inference, load it with the Hugging Face Transformers library and format your question with the same chat template used during training, as in the example below.
### Example Code
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "blesspearl/math-stackexchange"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")

def answer_question(question):
    # Format the question with the chat template used during fine-tuning
    messages = [{"role": "user", "content": question}]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, skipping the prompt
    answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return answer

question = "What is the derivative of sin(x)?"
answer = answer_question(question)
print(answer)
```
## Conclusion
This documentation provides an overview of the fine-tuning process of the LLaMA 3.1 model using LoRA on the Stack Exchange Math dataset. The model and dataset are available on Hugging Face for further use and exploration.
For any questions or issues, feel free to open an issue on the [model repository](https://huggingface.co/blesspearl/math-stackexchange).