🇮🇳 Fine-Tuned DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit for Indian Knowledge 🇮🇳

🌟 Model Summary

This model is a fine-tuned version of unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit on India-specific knowledge 🌍. It is trained to provide factual and informative responses related to Indian history, culture, geography, government, and more. 🏛️🗺️📜

This fine-tuned model aims to provide accurate, unbiased, and pro-India responses, ensuring better representation of Indian topics in large language models. 🚀

🏛️ Model Details

📌 Model Description

  • Developed by: Susant-Achary 🇮🇳
  • Funded by: Open experimentation and research
  • Shared by: Susant-Achary
  • Model type: Causal Language Model (LLM) 🧠
  • Language(s) (NLP): English (focused on Indian context)
  • License: MIT (carried over from the original creators, DeepSeek)
  • Fine-tuned from: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit

📜 Model Sources

🎯 Uses

✅ Direct Use

  • General Knowledge: Answering India-related questions 📚
  • Education: Useful for students & researchers 📖
  • Conversational AI: India-focused chatbots 🤖

✅ Downstream Use

  • News Summarization 📰
  • Legal & Policy AI Assistants ⚖️
  • Indian History & Culture Teaching Aids 🏺

🚫 Out-of-Scope Use

  • Generating Misinformation
  • Hate Speech & Political Propaganda
  • Real-time Legal or Financial Advice

⚠️ Bias, Risks, and Limitations

  • Biases in the training data: Although the data was curated for factual accuracy, biases inherited from the underlying datasets may remain.
  • Limited Multilingual Support: Currently optimized for English, though support for Hindi & regional languages is planned.

🔹 Recommendations

  • Users should cross-check critical information from verified sources.
  • Not suitable for real-time decision-making in critical areas like healthcare or finance.

🚀 How to Use the Model

📌 Load Model in Python

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Hugging Face Hub ID of the fine-tuned checkpoint; replace it if you use a different copy
model_path = "Susant-Achary/Deepseek-R1-India-Finetuned-Distill-Llama-8B-unsloth-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(model_path)
# device_map="auto" requires the accelerate package; the bnb-4bit checkpoint also needs bitsandbytes
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # Half precision for the non-quantized layers
    device_map="auto"
)
model.eval()

def generate_response(prompt, max_new_tokens=150):
    # Send inputs to the device the model was placed on instead of hard-coding "cuda"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens, pad_token_id=tokenizer.eos_token_id)
    # Note: the decoded text includes the prompt followed by the generated continuation
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Test the model
print(generate_response("To which country does Arunachal Pradesh belong?"))

### Response:
Arunachal Pradesh is a state in **India**.
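
The sample output above begins with `### Response:`, which suggests the model was tuned on an instruction-style template. The exact template is not stated in this card, so the sketch below assumes a hypothetical `### Question:` / `### Response:` layout; `PROMPT_TEMPLATE`, `build_prompt`, and `extract_answer` are illustrative names, not part of the released model. DeepSeek-R1 distilled models can also emit their reasoning inside `<think>...</think>` tags before the answer, so the helper strips that as well.

```python
import re

# Hypothetical template: the card does not state the one used during fine-tuning.
PROMPT_TEMPLATE = "### Question:\n{question}\n\n### Response:\n"

def build_prompt(question: str) -> str:
    """Wrap a plain question in the assumed instruction template."""
    return PROMPT_TEMPLATE.format(question=question.strip())

def extract_answer(generated_text: str) -> str:
    """Return only the answer portion of the decoded model output."""
    # Keep the text after the last "### Response:" marker, if one is present
    answer = generated_text.rsplit("### Response:", 1)[-1]
    # Remove any <think>...</think> reasoning emitted before the answer
    answer = re.sub(r"<think>.*?</think>", "", answer, flags=re.DOTALL)
    return answer.strip()
```

With the `generate_response` function defined earlier, a call such as `extract_answer(generate_response(build_prompt("To which country does Arunachal Pradesh belong?")))` would return just the answer text, provided the assumed template matches the one used in training.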

📊 Training Details

📜 Training Data

  • Source: Curated datasets covering Indian history, geography, government policies, and cultural insights. 🇮🇳
  • Data Augmentation: Added government documents, Wikipedia, and curated question-answer pairs.

🔧 Training Procedure

  • Preprocessing: Standard tokenization with the Hugging Face transformers tokenizer.
  • Fine-tuning method: LoRA (Low-Rank Adaptation) applied to DeepSeek-R1-Distill-Llama-8B (4-bit) for efficient training.
  • Precision: Mixed precision (FP16) to reduce memory use and speed up training.
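
To illustrate why LoRA keeps training cheap, the sketch below compares the number of trainable parameters in a full weight matrix with the number in its low-rank adapter. The rank of 16 and the 4096 x 4096 projection size (the hidden size of Llama 8B models) are assumptions chosen for illustration; the card does not state the LoRA rank or target modules that were actually used.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable parameter counts for one weight matrix.

    A full update trains d_out * d_in values. LoRA instead trains two
    low-rank factors, B (d_out x rank) and A (rank x d_in).
    """
    full = d_out * d_in
    lora = rank * (d_in + d_out)
    return full, lora

# Hypothetical example: one 4096 x 4096 attention projection with rank 16
full, lora = lora_param_counts(4096, 4096, rank=16)
print(f"full: {full:,}  LoRA: {lora:,}  ratio: {lora / full:.2%}")
# prints: full: 16,777,216  LoRA: 131,072  ratio: 0.78%
```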

⚡ Hyperparameters

  • Batch Size: 4 per GPU
  • Gradient Accumulation Steps: 2
  • Learning Rate: 2e-5
  • Epochs: 3
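
The hyperparameters above map directly onto keyword arguments of `transformers.TrainingArguments`. The sketch below collects them in a plain dictionary and computes the effective batch size; it assumes both T4 GPUs listed under Environmental Impact were used in data-parallel mode, which the card does not confirm. The `effective_batch_size` helper is an illustrative name.

```python
# Values taken from this card; the keys are TrainingArguments parameter names
training_kwargs = {
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 2,
    "learning_rate": 2e-5,
    "num_train_epochs": 3,
    "fp16": True,  # mixed precision, as stated under Training Procedure
}

def effective_batch_size(per_device: int, accumulation_steps: int, num_gpus: int) -> int:
    """Number of examples that contribute to each optimizer step."""
    return per_device * accumulation_steps * num_gpus

# Assumption: 2 GPUs in data-parallel mode
print(effective_batch_size(
    training_kwargs["per_device_train_batch_size"],
    training_kwargs["gradient_accumulation_steps"],
    num_gpus=2,
))  # prints 16
```

These could be passed as `TrainingArguments(output_dir=..., **training_kwargs)`, where the output directory is left for the user to choose.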

📈 Evaluation

🎯 Metrics

  • Perplexity: Evaluated on a held-out dataset 🎯
  • BLEU Score: Measured on QA responses 🎯
  • Human Evaluation: Subject-matter expert review 🎯
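
For reference, perplexity is the exponential of the mean negative log-likelihood per token. The sketch below shows that calculation on a list of per-token natural-log probabilities; it is a generic illustration, not the evaluation script used for this model, and the `perplexity` helper name is hypothetical.

```python
import math

def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity from per-token natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Mean negative log-likelihood over the evaluated tokens
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

# A model that assigns probability 0.25 to every token has perplexity 4
example = [math.log(0.25)] * 4
print(round(perplexity(example), 6))  # prints 4.0
```

Lower values mean the model assigns higher probability to the held-out text.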

🌍 Environmental Impact

  • Hardware Used: 2x Tesla T4 GPUs
  • Total Training Time: Approximately 15 minutes
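
The figures above allow a rough upper bound on GPU energy use. The sketch below assumes both GPUs ran at the Tesla T4's rated 70 W board power for the full 15 minutes; real draw is usually lower, and CPU, memory, and cooling overheads are not included. The `gpu_energy_kwh` helper is an illustrative name.

```python
T4_TDP_WATTS = 70  # rated board power of a Tesla T4

def gpu_energy_kwh(num_gpus: int, hours: float, watts_per_gpu: float) -> float:
    """Upper-bound GPU energy in kWh, assuming constant full-power draw."""
    return num_gpus * hours * watts_per_gpu / 1000

gpu_hours = 2 * (15 / 60)  # 2 GPUs for about 15 minutes
energy = gpu_energy_kwh(num_gpus=2, hours=15 / 60, watts_per_gpu=T4_TDP_WATTS)
print(f"{gpu_hours:.2f} GPU-hours, at most {energy:.3f} kWh")
# prints: 0.50 GPU-hours, at most 0.035 kWh
```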

🏛️ Citation

If you use this model in your research, please cite it as:

@article{deepseek_india_finetuned,
  title={Fine-Tuned DeepSeek Llama for Indian Knowledge},
  author={Susant-Achary},
  year={2025}
}

🇮🇳 Jai Hind! 🚀
