---
language:
- en
metrics:
- bertscore
- bleu
- rouge
base_model:
- meta-llama/Llama-3.2-3B-Instruct
library_name: transformers
tags:
- text-generation-inference
---
# Model Card

## Overview

This repository contains a LoRA fine-tuned version of Meta's Llama-3.2-3B-Instruct model, trained with PEFT (LoRA) on a custom bank customer-service FAQ dataset for question answering. The adapter weights, configuration, and tokenizer files are included for seamless inference via a single `from_pretrained` call.
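Alternatively, you can attach the adapter to the base model explicitly with `peft`. A minimal sketch, assuming the standard PEFT loading flow (repo IDs as listed in this card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-3B-Instruct"
adapter_id = "SardarTaimoor/llama3b-lora"

# Load the frozen base model, then layer the LoRA adapter weights on top.
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Optionally fold the adapter into the base weights for faster inference.
model = model.merge_and_unload()
```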
## Model Details

### Model Description

- Base model: `meta-llama/Llama-3.2-3B-Instruct`
- Method: PEFT (LoRA)
- LoRA configuration (see the sketch after this list):
  - Rank (r): 8
  - Alpha: 32
  - Dropout: 0.05
  - Target modules: `q_proj`, `v_proj`
- Task: Customer-service question answering on banking FAQs
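Expressed as a `peft` `LoraConfig`, the configuration above looks as follows; `bias` and `task_type` are assumptions based on common defaults and are not stated elsewhere in this card:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                                  # rank, as reported above
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention query/value projections
    bias="none",                          # assumption: common default
    task_type="CAUSAL_LM",                # assumption: causal LM fine-tuning
)
```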
### Metadata

- Developed by: Sardar Taimoor
- Fine-tuned by: [SardarTaimoor](https://huggingface.co/SardarTaimoor)
- Model type: Causal language model
- Language(s): English
- License: MIT
- Fine-tuned from: `meta-llama/Llama-3.2-3B-Instruct`
### Links

- Model repo: https://huggingface.co/SardarTaimoor/llama3b-lora
- Base model: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct
- Training notebook: `Fine-Tuning.ipynb`
## How to Use this Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SardarTaimoor/llama3b-lora"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",          # splits layers across GPU/CPU
    torch_dtype=torch.float16,  # half precision on GPU
    low_cpu_mem_usage=True,     # avoids fully materializing everything in host RAM
)

inputs = tokenizer("What's the Little Champs account?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
print("-" * 60)
```
## Training Details

### Data

- Dataset description: Custom customer-service FAQ dataset for bank products, formatted as JSONL with user prompts and assistant completions.
- Number of examples: 319 total (train: ~303, validation: ~16 after a 5% split; a loading sketch follows this list)
- Preprocessing steps: Prompts and completions extracted and cleaned from JSONL; tokenization via the original Llama tokenizer.
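A minimal sketch of the split described above using 🤗 `datasets`; the filename and split seed are illustrative placeholders, not the actual training artifacts:

```python
from datasets import load_dataset

# Load the JSONL file and carve out the 5% validation split described above.
dataset = load_dataset("json", data_files="bank_faq.jsonl", split="train")  # hypothetical filename
splits = dataset.train_test_split(test_size=0.05, seed=42)  # seed is an assumption
train_ds, eval_ds = splits["train"], splits["test"]
print(len(train_ds), len(eval_ds))  # ~303 / ~16 for 319 examples
```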
### Procedure

- Compute environment: Google Colab T4 GPU, Python 3
- Epochs: 20
- Batch size: 4 per device (gradient accumulation steps = 8, for an effective batch size of 32)
- Learning rate: 2e-5
- Precision: fp16
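As `transformers.TrainingArguments`, the hyperparameters above would look roughly like this; `output_dir` and the logging cadence are illustrative placeholders:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3b-lora-out",   # placeholder
    num_train_epochs=20,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,   # effective batch size of 32
    learning_rate=2e-5,
    fp16=True,
    logging_steps=10,                # placeholder
)
```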
Evaluation & Metrics
Evaluation dataset: 5% holdout from the custom FAQ dataset (~16 examples)
Metrics: BLEU, ROUGE, BERTScore
Results:
- BLEU: 0.0146
- ROUGE: rouge1=0.1083, rouge2=0.0281, rougeL=0.0816
- BERTScore (mean f1): 0.8211
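These metrics can be computed with the 🤗 `evaluate` library along the following lines; `preds` and `refs` stand in for decoded model outputs and reference answers from the holdout set:

```python
import evaluate

# Illustrative placeholders for decoded predictions and gold references.
preds = ["A savings account for children."]
refs = ["Little Champs is a savings account designed for children."]

bleu = evaluate.load("bleu").compute(predictions=preds, references=refs)
rouge = evaluate.load("rouge").compute(predictions=preds, references=refs)
bertscore = evaluate.load("bertscore").compute(predictions=preds, references=refs, lang="en")

print(bleu["bleu"], rouge["rougeL"], sum(bertscore["f1"]) / len(bertscore["f1"]))
```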
Limitations & Biases
- Known limitations: May hallucinate rare banking details; domain-restricted to the provided FAQ data.
- Potential biases: Reflects biases present in original Llama and the customer-service samples.
## License

This model is released under the MIT license. See LICENSE for details.

For questions or contributions, please open an issue on the model repo.