# DeepSeek-R1-Medical-CoT

🚀 **Fine-tuning DeepSeek R1 for Medical Chain-of-Thought Reasoning**
This model is a fine-tuned version of `deepseek-ai/DeepSeek-R1-Distill-Llama-8B`, specifically designed to enhance medical reasoning through Chain-of-Thought (CoT) prompting. It was trained using QLoRA with Unsloth optimization, allowing efficient fine-tuning on limited hardware resources.
## 📌 Model Details

### Model Description
- **Developed by:** [Your Name or Organization]
- **Fine-tuned from:** `deepseek-ai/DeepSeek-R1-Distill-Llama-8B`
- **Language(s):** English, with a focus on medical terminology
- **Training data:** medical reasoning dataset (`medical-o1-reasoning-SFT`)
- **Fine-tuning method:** QLoRA (4-bit adapters), later merged into 16-bit weights (see the sketch below)
- **Optimization:** Unsloth (2x faster fine-tuning with lower memory usage)
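
For reference, here is a minimal sketch of the QLoRA + Unsloth training setup described above. The hyperparameters (sequence length, LoRA rank/alpha, target modules) are illustrative assumptions, not the exact values used for this checkpoint:

```python
from unsloth import FastLanguageModel

# Load the base model with 4-bit quantized weights (the "Q" in QLoRA)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length=2048,  # assumed training context length
    load_in_4bit=True,
)

# Attach trainable LoRA adapters; rank and alpha here are illustrative
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# ... fine-tune on medical-o1-reasoning-SFT (e.g. with trl's SFTTrainer) ...

# Merge the adapters back into 16-bit weights for release
model.save_pretrained_merged(
    "DeepSeek-R1-Medical-CoT", tokenizer, save_method="merged_16bit"
)
```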
### Model Sources

- **Repository:** [Your Hugging Face Model Repo URL]
- **Paper (if applicable):** [Link]
- **Demo (if applicable):** [Link]
## 🛠 How to Use the Model

### 1️⃣ Load the Model
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_name = "zijiechen156/DeepSeek-R1-Medical-CoT"

# Download the tokenizer and merged 16-bit weights from the Hub
tokenizer = AutoTokenizer.from_pretrained(repo_name)
model = AutoModelForCausalLM.from_pretrained(repo_name)
model.eval()  # disable dropout for inference
```
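
The call above loads full-precision weights on CPU by default. On a GPU you can optionally load in half precision instead; a sketch (the `device_map="auto"` option requires the `accelerate` package):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    repo_name,
    torch_dtype=torch.float16,  # halves memory vs. the fp32 default
    device_map="auto",          # place weights on available GPU(s)
)
model.eval()
```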
### 2️⃣ Run Inference
```python
prompt = "What are the early symptoms of diabetes?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():  # no gradients needed for generation
    output = model.generate(**inputs, max_new_tokens=200)

response = tokenizer.decode(output[0], skip_special_tokens=True)
print("Model Response:", response)
```
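
Because the base model is a chat-style reasoning model, the fine-tuned checkpoint may respond better when the prompt is wrapped in its chat template (this assumes the base model's template was preserved during fine-tuning; `AutoTokenizer` loads it automatically if so):

```python
messages = [
    {"role": "user", "content": "What are the early symptoms of diabetes?"}
]

# Format the conversation with the model's built-in chat template
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=512,  # CoT traces can be long
        do_sample=True,
        temperature=0.6,     # sampling temperature recommended for DeepSeek-R1 models
    )

# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```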
## 📢 Acknowledgments
- DeepSeek-AI for releasing DeepSeek-R1
- Unsloth for optimized LoRA fine-tuning
- Hugging Face for hosting the models