---
base_model: unsloth/phi-4-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- gguf
license: apache-2.0
language:
- en
datasets:
- bespokelabs/Bespoke-Stratos-17k
- bespokelabs/Bespoke-Stratos-35k
- NovaSky-AI/Sky-T1_data_17k
- Quazim0t0/BenfordsLawReasoningJSON
- open-thoughts/OpenThoughts-114k
---

# Uploaded model

- **Developed by:** Quazim0t0
- **Finetuned from model:** unsloth/phi-4-unsloth-bnb-4bit
- **Format:** GGUF
- **Trained for 8 hours on an A800 with the Bespoke-Stratos-17k dataset.**
- **Trained for 6 hours on an A800 with the Bespoke-Stratos-35k dataset.**
- **Trained for 2 hours on an A800 with the Benford's Law Reasoning dataset (430 rows), with care taken to avoid overfitting on such a small set.**
- **Trained for 4 hours on an A800 with the Sky-T1_data_17k dataset.**
- **Trained for 2 hours on an A800 with the OpenThoughts-114k dataset.**
- **Roughly $15 in total training cost... I'm actually amazed by the results.** A minimal sketch of the training setup is shown below.

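The exact training configuration is not published here, so the following is only a minimal, illustrative sketch of how a LoRA run like the ones above can be set up with Unsloth and TRL. The hyperparameters, sequence length, and the `dataset_text_field` value are assumptions, not the settings actually used for these adapters.

```python
# Illustrative sketch only: hyperparameters and dataset column names are
# assumptions, not the exact values used to train these adapters.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 4-bit Unsloth build of Phi-4 and wrap it with LoRA adapters.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/phi-4-unsloth-bnb-4bit",
    max_seq_length=16384,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# One of the reasoning datasets listed above.
dataset = load_dataset("bespokelabs/Bespoke-Stratos-17k", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # adjust to the dataset's actual text column
    max_seq_length=16384,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        bf16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```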
If you are using this model with Open WebUI, here is a simple function that organizes the model's responses: https://openwebui.com/f/quaz93/phi4_turn_r1_distill_thought_function_v1

# Phi4 Turn R1Distill LoRA Adapters

## Overview

These **LoRA adapters** were trained on diverse **reasoning datasets** whose examples pair a structured **Thought** (reasoning trace) with a **Solution** (final answer), with the goal of strengthening logical inference. The project was designed to **test R1-style distilled reasoning data** on **Phi-4**, aiming to create a **lightweight, fast, and efficient reasoning model**.

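As an illustration of that structure, here is a minimal sketch for splitting a generated response into its reasoning trace and final answer. The delimiter tokens are an assumption based on the Bespoke-Stratos / Sky-T1 data format (`<|begin_of_thought|>` / `<|begin_of_solution|>`); check the model's actual output and adjust them if they differ.

```python
import re

# Assumed delimiters from the Bespoke-Stratos / Sky-T1 format; verify against
# real model output before relying on them.
THOUGHT_RE = re.compile(r"<\|begin_of_thought\|>(.*?)<\|end_of_thought\|>", re.DOTALL)
SOLUTION_RE = re.compile(r"<\|begin_of_solution\|>(.*?)<\|end_of_solution\|>", re.DOTALL)

def split_response(text: str) -> dict:
    """Split a generated response into its Thought and Solution parts."""
    thought = THOUGHT_RE.search(text)
    solution = SOLUTION_RE.search(text)
    return {
        "thought": thought.group(1).strip() if thought else "",
        "solution": solution.group(1).strip() if solution else text.strip(),
    }
```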
All adapters were fine-tuned on an **NVIDIA A800 GPU** and can be used for continued training, merged into the base model, or deployed directly.

As part of an open-source effort, all resources are made **publicly available** for unrestricted research and development.

---

## LoRA Adapters

Below are the currently available LoRA fine-tuned adapters (**as of January 30, 2025**):

- [Phi4.Turn.R1Distill-Lora1](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora1)
- [Phi4.Turn.R1Distill-Lora2](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora2)
- [Phi4.Turn.R1Distill-Lora3](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora3)
- [Phi4.Turn.R1Distill-Lora4](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora4)
- [Phi4.Turn.R1Distill-Lora5](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora5)
- [Phi4.Turn.R1Distill-Lora6](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora6)
- [Phi4.Turn.R1Distill-Lora7](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora7)
- [Phi4.Turn.R1Distill-Lora8](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill-Lora8)

---

## GGUF Full & Quantized Models

To facilitate broader testing and real-world inference, **full and quantized GGUF versions** are provided for evaluation in **Open WebUI** and other LLM interfaces. A minimal loading sketch follows the version list below.

### **Version 1**
- [Phi4.Turn.R1Distill.Q8_0](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill.Q8_0)
- [Phi4.Turn.R1Distill.Q4_k](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill.Q4_k)
- [Phi4.Turn.R1Distill.16bit](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill.16bit)

### **Version 1.1**
- [Phi4.Turn.R1Distill_v1.1_Q4_k](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill_v1.1_Q4_k)

### **Version 1.2**
- [Phi4.Turn.R1Distill_v1.2_Q4_k](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill_v1.2_Q4_k)

### **Version 1.3**
- [Phi4.Turn.R1Distill_v1.3_Q4_k-GGUF](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill_v1.3_Q4_k-GGUF)

### **Version 1.4**
- [Phi4.Turn.R1Distill_v1.4_Q4_k-GGUF](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill_v1.4_Q4_k-GGUF)

### **Version 1.5**
- [Phi4.Turn.R1Distill_v1.5_Q4_k-GGUF](https://huggingface.co/Quazim0t0/Phi4.Turn.R1Distill_v1.5_Q4_k-GGUF)

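Outside of Open WebUI, the quantized files can also be loaded directly with `llama-cpp-python`. This is a minimal sketch; the local file name is illustrative, and the context size and GPU offload settings are assumptions to adjust for your hardware.

```python
from llama_cpp import Llama

# Point model_path at whichever GGUF file you downloaded from the repos above.
llm = Llama(
    model_path="./Phi4.Turn.R1Distill_v1.5_Q4_k.gguf",
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload all layers to the GPU if one is available
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "How many prime numbers are below 50?"}],
    max_tokens=512,
)
print(response["choices"][0]["message"]["content"])
```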
---

## Usage

### **Loading LoRA Adapters with `transformers` and `peft`**

To load and apply the LoRA adapters on Phi-4, use the following approach:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "microsoft/phi-4"
lora_adapter = "Quazim0t0/Phi4.Turn.R1Distill-Lora1"

# Load the base model and tokenizer, then attach the LoRA adapter on top.
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,  # keep memory use reasonable for a 14B model
    device_map="auto",           # requires the `accelerate` package
)
model = PeftModel.from_pretrained(model, lora_adapter)

model.eval()
```
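Generation then works like any other `transformers` causal LM. Continuing from the snippet above, here is a small illustrative example; the prompt and generation settings are arbitrary:

```python
# Build a chat-formatted prompt and generate a response.
messages = [{"role": "user", "content": "If 3x + 7 = 22, what is x?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids=input_ids, max_new_tokens=1024)

# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```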