---
license: mit
datasets:
- blesspearl/stackexchange-math-sample
language:
- en
library_name: transformers
---

[Guide](https://medium.com/@rajatsharma_33357/fine-tuning-llama-using-lora-fb3f48a557d5)

# Fine-Tuned LLaMA 3.1 Model on Stack Exchange Math Dataset

This repository contains a LLaMA 3.1 model fine-tuned with LoRA on a dataset collected from Stack Exchange Math. The model is designed to answer mathematical questions in the style of accepted Stack Exchange answers.

## Model Details

- **Base Model:** [Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B)
- **Fine-Tuned Model:** [math-stackexchange](https://huggingface.co/blesspearl/math-stackexchange)
- **Dataset:** [stackexchange-math-sample](https://huggingface.co/datasets/blesspearl/stackexchange-math-sample)
- **Training Environment:**
  - Framework: PyTorch with Transformers
  - Platform: Google Colab
  - Hardware: 1 x T4 GPU (15 GB)

## Data Preparation

The fine-tuning dataset consists of 1,000 samples collected from Stack Exchange Math, each pairing a question with its accepted answer.

### Preprocessing

The data was preprocessed in three steps:

1. Load the dataset from Hugging Face.
2. Shuffle the dataset and select 1,000 samples.
3. Format each sample into a chat template suitable for training (a sketch of the resulting format follows this list).

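For illustration only: `setup_chat_format` from trl (used in the training code below) applies a ChatML-style template, so a formatted training sample should look roughly like the following sketch (the question and answer here are invented, not taken from the dataset):

```
<|im_start|>user
How do I show that the sum of two even integers is even?<|im_end|>
<|im_start|>assistant
Write the two integers as 2a and 2b; their sum is 2(a + b), which is even.<|im_end|>
```
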
## Training Details

### Libraries and Dependencies

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments, pipeline, logging
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training, get_peft_model
from google.colab import drive, userdata
import os, torch, wandb
from trl import SFTTrainer, setup_chat_format
from huggingface_hub import login
```

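The snippet imports `login` but never shows the call; the gated Meta-Llama-3.1-8B weights require Hugging Face authentication before download. A minimal sketch (the `HF_TOKEN` secret name is an assumption, not from the original notebook):

```python
# Authenticate to the Hugging Face Hub so the gated base model can be downloaded.
# "HF_TOKEN" is a hypothetical Colab secret name; use whatever your notebook defines.
login(token=userdata.get("HF_TOKEN"))
```
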
### Loading Data and Model

```python
model_name = "meta-llama/Meta-Llama-3.1-8B"
dataset_name = "blesspearl/stackexchange-math-sample"

torch_dtype = torch.float16
attn_implementation = "eager"

# Log the run to Weights & Biases
wandb.login(key=userdata.get("WANDB_API_KEY"))
run = wandb.init(
    project='Fine-tuning LLama-3.1-8b on math-stack-exchange',
    job_type="training",
    anonymous="allow"
)

# Load the base model with 4-bit NF4 quantization to fit on a single T4
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    attn_implementation=attn_implementation
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Attach a ChatML chat template to the model and tokenizer
model, tokenizer = setup_chat_format(model, tokenizer)
```

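Note that `prepare_model_for_kbit_training` is imported above but never called in the snippet. When fine-tuning a 4-bit model it is commonly applied before attaching the LoRA adapter; a one-line sketch (an addition, not shown in the original script):

```python
# Casts layer norms/embeddings to fp32 and enables input-gradient hooks,
# which stabilizes k-bit training (common practice in QLoRA-style setups).
model = prepare_model_for_kbit_training(model)
```
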
### LoRA Configuration

```python
peft_config = LoraConfig(
    r=16,            # LoRA rank
    lora_alpha=32,   # scaling factor
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    # Apply LoRA to all attention and MLP projection matrices
    target_modules=['up_proj', 'down_proj', 'gate_proj', 'k_proj', 'q_proj', 'v_proj', 'o_proj']
)
model = get_peft_model(model, peft_config)
```

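To confirm how small the trainable footprint is under this configuration, peft's built-in helper can be called on the wrapped model:

```python
# Prints trainable vs. total parameter counts for the LoRA-wrapped model.
model.print_trainable_parameters()
```
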
### Data Preparation

```python
dataset = load_dataset(dataset_name, split="all")
dataset = dataset.shuffle(seed=65).select(range(1000))

# Render each question/answer pair through the tokenizer's chat template
def format_chat_template(row):
    row_json = [{"role": "user", "content": row["question_body"]},
                {"role": "assistant", "content": row["accepted_answer"]}]
    row["text"] = tokenizer.apply_chat_template(row_json, tokenize=False)
    return row

dataset = dataset.map(format_chat_template, num_proc=4)
dataset = dataset.train_test_split(test_size=0.2)
dataset = dataset.remove_columns(["question_body", "accepted_answer"])
```

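A quick way to verify that the template was applied correctly is to print one processed example:

```python
# Should show ChatML-style markers wrapping the question and answer.
print(dataset["train"][0]["text"])
```
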
### Training Configuration

```python
training_arguments = TrainingArguments(
    output_dir="math-stackexchange",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,   # effective batch size of 2
    optim="paged_adamw_32bit",
    num_train_epochs=1,
    evaluation_strategy="steps",
    eval_steps=0.2,                  # evaluate every 20% of training steps
    logging_steps=1,
    warmup_steps=10,
    logging_strategy="steps",
    learning_rate=2e-4,
    fp16=False,
    bf16=False,
    group_by_length=True,
    report_to="wandb"
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    peft_config=peft_config,
    max_seq_length=512,
    dataset_text_field="text",
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,
)
trainer.train()
wandb.finish()
# Enable the KV cache for generation after training
model.config.use_cache = True
```

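The snippet ends right after training; the README does not show how the adapter was saved or uploaded. A typical follow-up, included here only as a hedged sketch, would be:

```python
# Hypothetical persistence step, not part of the original script:
# save the LoRA adapter and tokenizer locally (push to the Hub if desired).
trainer.model.save_pretrained("math-stackexchange")
tokenizer.save_pretrained("math-stackexchange")
```
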
## Model and Dataset

- **Model:** [math-stackexchange](https://huggingface.co/blesspearl/math-stackexchange)
- **Dataset:** [stackexchange-math-sample](https://huggingface.co/datasets/blesspearl/stackexchange-math-sample)

## Usage

To use the fine-tuned model for inference, load it with the Hugging Face Transformers library and pass in your question.

### Example Code

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "blesspearl/math-stackexchange"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def answer_question(question):
    # The model was trained on chat-formatted text, so apply the same
    # chat template at inference time instead of passing the raw question.
    messages = [{"role": "user", "content": question}]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Without max_new_tokens, generate() falls back to a very short default length.
    outputs = model.generate(**inputs, max_new_tokens=256)
    answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return answer

question = "What is the derivative of sin(x)?"
answer = answer_question(question)
print(answer)
```

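If the Hub repository stores only the LoRA adapter rather than fully merged weights, `AutoModelForCausalLM.from_pretrained` on the adapter repo will not work on its own; in that case, load the base model first and attach the adapter with peft. A sketch under that assumption:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes adapter-only weights on the Hub; adjust if the repo ships merged weights.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B", device_map="auto")
model = PeftModel.from_pretrained(base, "blesspearl/math-stackexchange")
tokenizer = AutoTokenizer.from_pretrained("blesspearl/math-stackexchange")
```
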
## Conclusion

This document describes the fine-tuning of LLaMA 3.1 with LoRA on the Stack Exchange Math dataset. Both the model and the dataset are available on Hugging Face for further use and exploration.

For questions or issues, feel free to open an issue on the [model repository](https://huggingface.co/blesspearl/math-stackexchange).