🇮🇳 Fine-Tuned DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit for Indian Knowledge 🇮🇳

🌟 Model Summary

This model is a fine-tuned version of unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit on India-specific knowledge 🌍. It is trained to provide factual and informative responses related to Indian history, culture, geography, government, and more. 🏛️🗺️📜

This fine-tuned model aims to provide accurate, unbiased, and pro-India responses, ensuring better representation of Indian topics in large language models. 🚀

🏛️ Model Details

📌 Model Description

Developed by: Susant-Achary (fine-tuning) 🇮🇳
Funded by: Open Experimentation and Research
Shared by: Susant-Achary
Model type: Causal Language Model (LLM) 🧠
Language(s) (NLP): English (focused on Indian context)
License: MIT (carried over from the original creators, DeepSeek)
Fine-tuned from: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit

📜 Model Sources

Repository: Hugging Face Model Link

🎯 Uses

✅ Direct Use

General Knowledge: Answering India-related questions 📚
Education: Useful for students & researchers 📖
Conversational AI: India-focused chatbots 🤖

✅ Downstream Use

News Summarization 📰
Legal & Policy AI Assistants ⚖️
Indian History & Culture Teaching Aids 🏺

🚫 Out-of-Scope Use

Generating Misinformation ❌
Hate Speech & Political Propaganda ❌
Real-time Legal or Financial Advice ❌

⚠️ Bias, Risks, and Limitations

Biases in the training data: While we aim for factual accuracy, some biases may remain from pre-existing datasets.
Limited Multilingual Support: Currently optimized for English, though support for Hindi & regional languages is planned.

🔹 Recommendations

Users should cross-check critical information from verified sources.
Not suitable for real-time decision-making in critical areas like healthcare or finance.

🚀 How to Use the Model

📌 Load Model in Python

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "Susant-Achary/Deepseek-R1-India-Finetuned-Distill-Llama-8B-unsloth-bnb-4bit"  # Replace with actual model name

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # Optimized for inference
    device_map="auto"
)
model.eval()

def generate_response(prompt, max_new_tokens=150):
    # Move inputs to wherever device_map="auto" placed the model,
    # rather than hard-coding "cuda".
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,
        )
    # The decoded text contains the prompt followed by the completion.
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Test the model
print(generate_response("To which country does Arunachal Pradesh belong?"))

### Response:
Arunachal Pradesh is a state in **India**.
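
Because the decoded text includes the prompt as well as the completion, it can help to keep only the answer. The helper below is a minimal sketch, assuming the output contains a "### Response:" marker as in the sample above; `extract_response` is a hypothetical name and not part of this model's API.

```python
def extract_response(decoded, marker="### Response:"):
    """Return the text after the last marker, or the whole text if the marker is absent.

    The marker is an assumption based on the sample output shown in this card.
    """
    # rpartition returns ("", "", decoded) when the marker is missing,
    # so `tail` is the full text in that case.
    _, _, tail = decoded.rpartition(marker)
    return tail.strip()


sample = (
    "To which country does Arunachal Pradesh belong?\n"
    "### Response:\n"
    "Arunachal Pradesh is a state in **India**."
)
print(extract_response(sample))
```

If the marker never appears, the function returns the full decoded string, so it is safe to apply to every call.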

📊 Training Details

📜 Training Data

Source: Curated datasets covering Indian history, geography, government policies, and cultural insights. 🇮🇳
Data Augmentation: Added government documents, Wikipedia, and curated question-answer pairs.

🔧 Training Procedure

Preprocessing: Standard tokenization with transformers.
Fine-tuned on: DeepSeek-R1-Distill-Llama-8B (the 4-bit Unsloth checkpoint named above) using LoRA (Low-Rank Adaptation) for efficient training.
Precision: Mixed precision (FP16) to reduce memory use and speed up training.
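
To illustrate why LoRA keeps training cheap, here is a minimal sketch of the parameter arithmetic for a single weight matrix. The rank of 16 is an assumption for illustration only (this card does not state the rank that was used). The 4096 x 4096 shape matches a query projection in the Llama 8B architecture.

```python
def lora_param_counts(d_out, d_in, rank):
    """Compare a full weight matrix with its LoRA adapter.

    LoRA freezes W (d_out x d_in) and trains two small matrices,
    B (d_out x rank) and A (rank x d_in), so the update is B @ A.
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# rank=16 is a hypothetical value, not taken from this model's training run.
full, lora = lora_param_counts(4096, 4096, rank=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
# full: 16,777,216  lora: 131,072  ratio: 0.78%
```

At this assumed rank, the adapter trains under 1% of the parameters of the matrix it adapts.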

⚡ Hyperparameters

Batch Size: 4 per GPU
Gradient Accumulation Steps: 2
Learning Rate: 2e-5
Epochs: 3
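
The effective batch size follows from these values. The sketch below assumes both T4 GPUs listed under Environmental Impact were used in data-parallel mode. The dataset size of 1,000 examples is a hypothetical figure, since the card does not state it.

```python
import math


def effective_batch_size(per_gpu_batch, grad_accum_steps, num_gpus):
    """Number of examples contributing to each optimizer step."""
    return per_gpu_batch * grad_accum_steps * num_gpus


def optimizer_steps(num_examples, effective_batch, epochs):
    """Total optimizer steps, rounding the last partial batch of each epoch up."""
    return math.ceil(num_examples / effective_batch) * epochs


eff = effective_batch_size(per_gpu_batch=4, grad_accum_steps=2, num_gpus=2)
print(eff)  # 16

# 1000 is a hypothetical dataset size used only to show the arithmetic.
print(optimizer_steps(num_examples=1000, effective_batch=eff, epochs=3))  # 189
```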

📈 Evaluation

🎯 Metrics

Perplexity: Evaluated on a held-out dataset 🎯
BLEU Score: Measured on QA responses 🎯
Human Evaluation: Subject-matter expert review 🎯
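
For reference, perplexity is the exponential of the average negative log-likelihood per token. The function below is a minimal sketch of that definition on a list of per-token log-probabilities. It is not the evaluation script used for this model, and the sample values are made up.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood), natural-log inputs."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Made-up probabilities the model assigned to three reference tokens.
sample_log_probs = [math.log(0.5), math.log(0.25), math.log(0.5)]
print(round(perplexity(sample_log_probs), 4))  # 2.5198
```

Lower is better: a model that assigns probability 1/4 to every token has a perplexity of exactly 4.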

🌍 Environmental Impact

Hardware Used: 2x Tesla T4 GPUs ⚡
Total Training Time: Approx. 15 minutes

🏛️ Citation

If you use this model in your research, please cite it as:

@article{deepseek_india_finetuned,
  title={Fine-Tuned DeepSeek Llama for Indian Knowledge},
  author={Susant-Achary},
  year={2025}
}

🇮🇳 Jai Hind! 🚀