---
base_model: microsoft/Phi-3-mini-4k-instruct
library_name: peft
---

Model Card for LoRA Fine-Tuned Phi-3-mini-4k

This model is a LoRA (Low-Rank Adaptation) fine-tuned version of microsoft/Phi-3-mini-4k-instruct, trained with 4-bit quantization for memory-efficient training and inference. It was trained and evaluated on instruction-following tasks, and the lightweight adapter can be saved and shared via the Hugging Face Hub.


Model Details

Model Description

This model adapts the Phi-3-mini-4k-instruct LLM using LoRA, a parameter-efficient fine-tuning technique, and 4-bit quantization for reduced memory usage. It is suitable for a variety of NLP tasks, especially where resource efficiency is important.

  • Developed by: Amit Chaubey
  • Funded by: N/A
  • Shared by: Amit Chaubey
  • Model type: Causal Language Model (LLM), LoRA fine-tuned, 4-bit quantized
  • Language(s) (NLP): English
  • License: MIT (the base model microsoft/Phi-3-mini-4k-instruct is also released under MIT)
  • Finetuned from model: microsoft/Phi-3-mini-4k-instruct

Model Sources

  • Repository: https://huggingface.co/sweatSmile/ak-phi3-mini-sarcasm-adapter

Uses

Direct Use

  • Text generation
  • Instruction following
  • Conversational AI
  • Educational and research purposes

Out-of-Scope Use

  • Not suitable for real-time safety-critical applications
  • Not intended for generating harmful, biased, or misleading content

Bias, Risks, and Limitations

  • The model may reflect biases present in the training data.
  • Not suitable for sensitive or high-stakes decision-making.
  • Outputs should be reviewed by humans before use in production.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. Always validate outputs for your use case.


How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "microsoft/Phi-3-mini-4k-instruct"
lora_adapter = "sweatSmile/ak-phi3-mini-sarcasm-adapter"  # or a path to a local LoRA adapter

# Load the base model, then attach the LoRA adapter on top of it.
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, lora_adapter)

# The tokenizer comes from the base model id, not the PEFT-wrapped model object.
tokenizer = AutoTokenizer.from_pretrained(base_model)

prompt = "The weather is good today"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
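
Because only the small LoRA adapter needs to be stored, it can be saved and shared via the Hugging Face Hub as mentioned above. A minimal sketch follows; the repository id your-username/phi3-lora-adapter is a placeholder, not an existing repo.

# Persist only the LoRA adapter weights (a few MB), not the full base model.
model.save_pretrained("phi3-lora-adapter")
tokenizer.save_pretrained("phi3-lora-adapter")

# Optionally push the adapter to the Hugging Face Hub (requires `huggingface-cli login`).
model.push_to_hub("your-username/phi3-lora-adapter")      # placeholder repo id
tokenizer.push_to_hub("your-username/phi3-lora-adapter")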

Training Details

Training Data

  • Fine-tuned on the Hugging Face dataset sweatSmile/sarcastic-dataset for demonstration purposes; replace it with your own dataset as needed (a loading sketch follows below).
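
As a hedged sketch, the dataset can be pulled with the datasets library; the split name and column layout below are assumptions, so inspect the printed structure before training.

from datasets import load_dataset

# Load the demonstration dataset from the Hub; swap in your own dataset id or files.
dataset = load_dataset("sweatSmile/sarcastic-dataset")
print(dataset)                    # inspect available splits and columns
train_data = dataset["train"]     # assumes a "train" split exists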

Training Procedure

  • LoRA rank: 8
  • LoRA alpha: 16
  • LoRA dropout: 0.05
  • Target modules: ["query_key_value", "o_proj", "qkv_proj", "gate_up_proj", "down_proj"]
  • 4-bit quantization (nf4)
  • Training regime: bf16 mixed precision (see the configuration sketch below)
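
These hyperparameters map roughly onto a peft LoraConfig plus a bitsandbytes 4-bit quantization config, as in the minimal sketch below. This is an illustrative reconstruction, not the exact training script, and defaults may differ across library versions.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit nf4 quantization with bf16 compute, matching the settings listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA settings from the Training Procedure section.
# "query_key_value" does not match any Phi-3 module name and is simply ignored by PEFT.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value", "o_proj", "qkv_proj", "gate_up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # should report roughly 0.33% trainable parameters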

Speeds, Sizes, Times

  • Model size: ~3.8B base parameters, of which ~0.33% are trainable LoRA parameters
  • Training time: varies by hardware (roughly 30 minutes to 1 hour on an A100 GPU for a small dataset of ~1,000 examples)

Evaluation

Testing Data, Factors & Metrics

  • Used a held-out portion of the training dataset for evaluation.
  • Metrics: perplexity and qualitative review of generated outputs (a perplexity sketch follows below).
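
Perplexity is the exponential of the average token-level cross-entropy loss on held-out text. A minimal, unbatched sketch (assuming model and tokenizer are already loaded as in the quick-start snippet above) looks like this:

import math
import torch

def perplexity(model, tokenizer, texts):
    """Average perplexity over a list of held-out strings (simple, unbatched sketch)."""
    losses = []
    model.eval()
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(text, return_tensors="pt").to(model.device)
            # With labels == input_ids, the model returns the mean cross-entropy loss.
            out = model(**enc, labels=enc["input_ids"])
            losses.append(out.loss.item())
    return math.exp(sum(losses) / len(losses))

print(perplexity(model, tokenizer, ["An example held-out sentence."]))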

Results

  • Qualitative review indicates the model retains strong instruction-following and text-generation capabilities after LoRA fine-tuning.

Environmental Impact

  • Hardware Type: NVIDIA L4 GPU (on Google Cloud)
  • Hours used: ~0.3 hours (about 20 minutes)
  • Cloud Provider: Google Cloud
  • Compute Region: UK
  • Carbon Emitted: not measured; can be estimated with the ML CO2 Impact calculator

Technical Specifications

Model Architecture and Objective

  • Base: Phi-3-mini-4k-instruct (Causal LM)
  • LoRA fine-tuning with PEFT
  • 4-bit quantization for efficiency

Compute Infrastructure

  • Hardware: NVIDIA L4 or A100 GPU (or similar)
  • Software: Python 3.10+, PyTorch, Transformers, PEFT, Datasets

Citation

BibTeX:

@article{hu2021lora,
  title={LoRA: Low-Rank Adaptation of Large Language Models},
  author={Hu, Edward J. and others},
  journal={arXiv preprint arXiv:2106.09685},
  year={2021}
}

Model Card Authors

  • Amit Chaubey

Model Card Contact

  • sweatSmile

Framework versions

  • PEFT 0.15.2