DeepSeek-R1-Medical-CoT

🚀 Fine-tuning DeepSeek R1 for Medical Chain-of-Thought Reasoning

This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Llama-8B, specifically designed to enhance medical reasoning through Chain-of-Thought (CoT) prompting. It is trained using QLoRA with Unsloth optimization, allowing efficient fine-tuning on limited hardware resources.
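Chain-of-Thought fine-tuning pairs each question with an explicit reasoning trace before the final answer. A minimal sketch of what one such training prompt can look like — the section headers, tag convention, and wording here are illustrative assumptions, not the exact template used to train this model:

```python
def build_cot_example(question: str, reasoning: str, answer: str) -> str:
    """Format one training example with an explicit reasoning trace.

    Hypothetical template for illustration; the actual prompt format
    used to fine-tune this model may differ.
    """
    return (
        "### Question:\n"
        f"{question}\n\n"
        "### Response:\n"
        f"<think>\n{reasoning}\n</think>\n"
        f"{answer}"
    )

example = build_cot_example(
    question="A patient presents with polyuria and polydipsia. What is the likely diagnosis?",
    reasoning="Polyuria and polydipsia are classic osmotic symptoms of hyperglycemia.",
    answer="The presentation is most consistent with diabetes mellitus.",
)
print(example)
```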


📌 Model Details

Model Description

  • Developed by: zijiechen156
  • Fine-tuned from: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  • Language(s): English, with a focus on medical terminology
  • Training Data: Medical reasoning dataset (medical-o1-reasoning-SFT)
  • Fine-tuning Method: QLoRA (4-bit adapters), later merged into 16-bit weights
  • Optimization: Unsloth (2x faster fine-tuning with lower memory usage)
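QLoRA keeps the 4-bit base weights frozen and trains only small low-rank adapters, which is why fine-tuning fits on limited hardware. A rank-r adapter on a d_out × d_in weight adds only r · (d_in + d_out) trainable parameters. The quick estimate below uses illustrative numbers (hidden size 4096, rank 16, adapters on the four attention projections across 32 layers) — not this model's exact configuration, and it ignores the smaller grouped-query K/V projections of Llama-style models:

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added by one rank-r LoRA adapter:
    an (r x d_in) down-projection plus a (d_out x r) up-projection."""
    return r * (d_in + d_out)

# Hypothetical configuration for illustration only:
hidden = 4096
r = 16
per_layer = 4 * lora_param_count(hidden, hidden, r)  # q, k, v, o projections
total = 32 * per_layer                               # 32 transformer layers
print(f"{total / 1e6:.1f}M trainable parameters vs ~8B frozen base weights")
```

Even with generous assumptions, the trainable parameters number in the tens of millions, a tiny fraction of the 8B frozen base.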

Model Sources

  • Repository: https://huggingface.co/zijiechen156/DeepSeek-R1-Medical-CoT
  • Paper (if applicable): [Link]
  • Demo (if applicable): [Link]

🛠 How to Use the Model

1️⃣ Load the Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_name = "zijiechen156/DeepSeek-R1-Medical-CoT"

tokenizer = AutoTokenizer.from_pretrained(repo_name)
model = AutoModelForCausalLM.from_pretrained(
    repo_name,
    torch_dtype=torch.float16,  # merged 16-bit weights
    device_map="auto",          # requires `accelerate`; places layers on available devices
)
model.eval()
```

2️⃣ Run Inference

```python
import torch

prompt = "What are the early symptoms of diabetes?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():  # no gradients needed for inference
    output = model.generate(**inputs, max_new_tokens=200)

response = tokenizer.decode(output[0], skip_special_tokens=True)
print("Model Response:", response)
```
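DeepSeek-R1 distills typically wrap their chain of thought in <think>…</think> tags before the final answer. A small helper to separate the reasoning from the answer — this assumes that tag convention, so verify it against the model's actual output:

```python
def split_cot(text: str) -> tuple[str, str]:
    """Split generated text into (reasoning, final_answer).

    Assumes the DeepSeek-R1 convention of wrapping the chain of
    thought in <think>...</think>; returns an empty reasoning
    string if no such tags are found.
    """
    start, end = text.find("<think>"), text.find("</think>")
    if start == -1 or end == -1:
        return "", text.strip()
    reasoning = text[start + len("<think>"):end].strip()
    answer = text[end + len("</think>"):].strip()
    return reasoning, answer

reasoning, answer = split_cot(
    "<think>Polyuria suggests hyperglycemia.</think>Likely diabetes mellitus."
)
print(answer)  # Likely diabetes mellitus.
```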

📢 Acknowledgments

  • DeepSeek-AI for releasing DeepSeek-R1
  • Unsloth for optimized LoRA fine-tuning
  • Hugging Face for hosting the models
