---
base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
license: mit
pipeline_tag: question-answering
tags:
  - medical
  - chain-of-thought
  - fine-tuning
  - qlora
  - unsloth
---

# DeepSeek-R1-Medical-CoT

🚀 **Fine-tuning DeepSeek R1 for Medical Chain-of-Thought Reasoning**

This model is a fine-tuned version of `deepseek-ai/DeepSeek-R1-Distill-Llama-8B`, designed to strengthen medical reasoning through **Chain-of-Thought (CoT) prompting**. It was trained with **QLoRA** using **Unsloth**, which makes fine-tuning feasible on limited hardware.

---

## 📌 Model Details

### **Model Description**

- **Developed by:** [Your Name or Organization]
- **Fine-tuned from:** `deepseek-ai/DeepSeek-R1-Distill-Llama-8B`
- **Language(s):** English, with a focus on medical terminology
- **Training Data:** Medical reasoning dataset (`medical-o1-reasoning-SFT`)
- **Fine-tuning Method:** QLoRA (4-bit adapters), later merged into 16-bit weights
- **Optimization:** Unsloth (roughly 2x faster fine-tuning with lower memory usage)

### **Model Sources**

- **Repository:** [Your Hugging Face Model Repo URL]
- **Paper (if applicable):** [Link]
- **Demo (if applicable):** [Link]

---

## 🛠 **How to Use the Model**

### 1️⃣ Load the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_name = "your-huggingface-username/DeepSeek-R1-Medical-CoT"

tokenizer = AutoTokenizer.from_pretrained(repo_name)
model = AutoModelForCausalLM.from_pretrained(repo_name)
model.eval()  # disable dropout for inference
```

### 2️⃣ Run Inference

```python
import torch  # needed for torch.no_grad()

prompt = "What are the early symptoms of diabetes?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=200)

response = tokenizer.decode(output[0], skip_special_tokens=True)
print("Model Response:", response)
```

Two hedged appendices at the end of this card sketch a possible CoT prompt format and a QLoRA + Unsloth fine-tuning setup.

---

## 📢 Acknowledgments

- **DeepSeek-AI** for releasing DeepSeek-R1
- **Unsloth** for optimized LoRA fine-tuning
- **Hugging Face** for hosting the models
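
---

## 🧠 Appendix: CoT Prompt Format (sketch)

The exact prompt template used during fine-tuning is not documented in this card. The sketch below shows one plausible instruction-style CoT template; the template text and the `build_prompt` helper are illustrative assumptions, so adjust them to match the actual training format.

```python
# Hypothetical CoT template -- NOT taken from the training script.
COT_TEMPLATE = """Below is a medical question. Think through the problem step
by step before giving a final answer.

### Question:
{question}

### Response:
"""

def build_prompt(question: str) -> str:
    """Wrap a raw question in the CoT instruction template."""
    return COT_TEMPLATE.format(question=question)

prompt = build_prompt("What are the early symptoms of diabetes?")
```

A prompt formatted this way can be passed to the inference snippet above in place of the bare question string.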
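
---

## 🧪 Appendix: Fine-Tuning Sketch (QLoRA + Unsloth)

The card states the model was trained with QLoRA via Unsloth and later merged to 16-bit weights, but no training script is included. Below is a minimal sketch assuming Unsloth's `FastLanguageModel` API, the TRL `SFTTrainer` signature used in Unsloth's example notebooks, and the `FreedomIntelligence/medical-o1-reasoning-SFT` dataset id (the card names only `medical-o1-reasoning-SFT`). All hyperparameters are illustrative, not the values actually used.

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit so LoRA training fits on a single GPU (QLoRA).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; the rank and target modules here are illustrative.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Assumed dataset id; the card names only `medical-o1-reasoning-SFT`.
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT",
                       "en", split="train")

def to_text(example):
    # Collapse the dataset fields into one training string. The field names
    # are assumptions -- check the dataset schema before running.
    return {"text": (f"### Question:\n{example['Question']}\n\n"
                     f"### Response:\n{example['Complex_CoT']}\n"
                     f"{example['Response']}")}

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=60,  # illustrative; train longer in practice
        fp16=True,
        output_dir="outputs",
    ),
)
trainer.train()

# Merge the 4-bit adapters into 16-bit weights, matching the card's
# "merged into 16-bit weights" description.
model.save_pretrained_merged("DeepSeek-R1-Medical-CoT", tokenizer,
                             save_method="merged_16bit")
```

The final `save_pretrained_merged(..., save_method="merged_16bit")` call folds the LoRA adapters back into the base weights, which corresponds to the 4-bit-adapters-merged-to-16-bit workflow described in Model Details.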