DeepSeek-R1-Distill-Qwen-1.5B Fine-Tuned on GSM8K with Chain-of-Thought Augmentation

Model Overview

This model is a fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B, trained on the OpenAI GSM8K dataset augmented with Chain-of-Thought (CoT) reasoning generated by DeepSeek-V3. Fine-tuning enhances the model's mathematical problem-solving ability, allowing it to produce step-by-step solutions with deeper reasoning.
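For context, CoT augmentation here means asking a stronger model (DeepSeek-V3) to write out the reasoning behind each existing question/answer pair. The exact prompt used is not published here; the sketch below is purely illustrative:

```python
def build_cot_request(question: str, final_answer: str) -> str:
    """Illustrative prompt asking a stronger model to produce step-by-step
    reasoning for an existing GSM8K question/answer pair (wording is hypothetical)."""
    return (
        "Solve the following grade-school math problem step by step, "
        "showing every intermediate calculation.\n"
        f"Problem: {question}\n"
        f"The correct final answer is {final_answer}. Explain how to reach it."
    )

prompt = build_cot_request(
    "Tom has 3 bags with 4 marbles each. How many marbles does he have?", "12"
)
```

The stronger model's reply then becomes the CoT field paired with each original question and answer.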

🔹 Key Features

  • Base Model: DeepSeek-R1-Distill-Qwen-1.5B
  • Fine-Tuned On: GSM8K dataset with DeepSeek-V3-enhanced reasoning
  • Improved Mathematical Reasoning: Generates detailed step-by-step CoT explanations
  • Optimized for GRPO Training: Trained using trl and unsloth for efficient fine-tuning

📊 Dataset & Training Details

  • Dataset: eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1
    • 8K train samples, 1K test samples
    • Contains question, answer, and CoT reasoning
  • Training Methodology:
    • Used Group Relative Policy Optimization (GRPO) via trl
    • Applied gradient accumulation to manage larger batch sizes
    • Integrated DeepSeek-V3 augmentation for enhanced logical reasoning
  • Fine-tuning Tools:
    • Unsloth for memory-efficient fine-tuning
    • Hugging Face Transformers for model training
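GRPO optimizes against a reward computed per sampled completion. Below is a minimal sketch of the kind of correctness reward typically used for GSM8K-style training; the `####` answer delimiter follows the GSM8K convention, and the function names are illustrative rather than taken from the actual training script:

```python
import re

def extract_final_answer(text: str) -> str:
    """Pull the final numeric answer after the GSM8K '####' delimiter."""
    match = re.search(r"####\s*([-+]?[\d,]*\.?\d+)", text)
    return match.group(1).replace(",", "") if match else ""

def correctness_reward(completion: str, gold_answer: str) -> float:
    """Return 1.0 if the completion's final answer matches the gold answer, else 0.0."""
    same = extract_final_answer(completion) == extract_final_answer(gold_answer)
    return 1.0 if same else 0.0

print(correctness_reward("6 * 3 = 18, 24 - 18 = 6\n#### 6", "#### 6"))  # -> 1.0
```

In trl's GRPO setup, a reward function like this is evaluated on each of the sampled completions for a prompt, and the group of rewards drives the policy update.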

For those interested in replicating the fine-tuning process, I have shared an updated Colab notebook 📓:
🔗 Colab Notebook

You will need:
✅ Hugging Face Token
✅ Together.AI API Key
✅ Unsloth Package
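To keep these credentials out of the notebook itself, one option is to read them from environment variables. A small helper sketch; the variable names `HF_TOKEN` and `TOGETHER_API_KEY` are my own convention, not required by any of the tools:

```python
import os

def get_required_env(name: str) -> str:
    """Read a credential from the environment, failing loudly if it is unset."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running.")
    return value

# Illustrative usage (names are assumptions, not mandated by the notebook):
# hf_token = get_required_env("HF_TOKEN")
# together_key = get_required_env("TOGETHER_API_KEY")
```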


🚀 How to Run the Model (Mac via llama.cpp)

Yes! You can run this model locally on macOS using llama.cpp.

1๏ธโƒฃ Install Homebrew (If Not Installed)

```bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```

Then add Homebrew to your PATH:

```bash
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"
```

2๏ธโƒฃ Install llama.cpp

```bash
brew install llama.cpp
```

3๏ธโƒฃ Run the Model with llama-cli

```bash
llama-cli -hf eagle0504/deepseek-r1-qwen-1.5b-gsm8k-enhanced-gguf:Q8_0
```

4๏ธโƒฃ Alternative: Run Locally via GGUF

```bash
mkdir -p ~/llama_models && cd ~/llama_models
wget https://huggingface.co/eagle0504/deepseek-r1-qwen-1.5b-gsm8k-enhanced-gguf/resolve/main/Q8_0.gguf
llama-cli -m ~/llama_models/Q8_0.gguf --interactive
```

📌 How to Use the Model via Python (transformers)

You can load the model with Hugging Face Transformers:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "eagle0504/deepseek-r1-qwen-1.5b-gsm8k-enhanced"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "A farmer has 24 apples. He gives 6 to each of his 3 children. How many does he have left?"
inputs = tokenizer(prompt, return_tensors="pt")
# Cap the number of newly generated tokens rather than the total sequence length
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

🔬 Expected Performance

Compared to the base DeepSeek-R1-Distill-Qwen-1.5B, this fine-tuned model:

  • Provides more detailed Chain-of-Thought (CoT) explanations for GSM8K problems.
  • Improves logical reasoning and step-by-step answer formulation.
  • Generates clearer, more structured solutions, making it ideal for educational use.

🗂 Model Hosting & License

📌 Model on Hugging Face Hub:
👉 eagle0504/deepseek-r1-qwen-1.5b-gsm8k-enhanced

📜 License: MIT License – open for modification and distribution.


If you have feedback or ideas for improvement, feel free to reach out! 🚀🔥

#AI #MachineLearning #DeepSeek #GSM8K #LLM #ChainOfThought #HuggingFace #GRPO #Reasoning
