🇮🇳 Fine-Tuned DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit for Indian Knowledge 🇮🇳
🌟 Model Summary
This model is a fine-tuned version of unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit on India-specific knowledge 🌍. It is trained to provide factual and informative responses related to Indian history, culture, geography, government, and more. 🏛️🗺️📜
This fine-tuned model aims to provide accurate, unbiased, and pro-India responses, ensuring better representation of Indian topics in large language models. 🚀
🏛️ Model Details
📌 Model Description
- Developed by: Susant-Achary 🇮🇳
- Funded by: Open experimentation and research
- Shared by: Susant-Achary
- Model type: Causal Language Model (LLM) 🧠
- Language(s) (NLP): English (focused on Indian context)
- License: MIT (carried over from the original DeepSeek release)
- Fine-tuned from: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit
📜 Model Sources
- Repository: Susant-Achary/Deepseek-R1-India-Finetuned-Distill-Llama-8B-unsloth-bnb-4bit on Hugging Face
🎯 Uses
✅ Direct Use
- General Knowledge: Answering India-related questions 📚
- Education: Useful for students & researchers 📖
- Conversational AI: India-focused chatbots 🤖
✅ Downstream Use
- News Summarization 📰
- Legal & Policy AI Assistants ⚖️
- Indian History & Culture Teaching Aids 🏺
🚫 Out-of-Scope Use
- Generating Misinformation ❌
- Hate Speech & Political Propaganda ❌
- Real-time Legal or Financial Advice ❌
⚠️ Bias, Risks, and Limitations
- Biases in the training data: While we strive for factual accuracy, some biases may remain from pre-existing datasets.
- Limited Multilingual Support: Currently optimized for English, though support for Hindi & regional languages is planned.
🔹 Recommendations
- Users should cross-check critical information from verified sources.
- Not suitable for real-time decision-making in critical areas like healthcare or finance.
🚀 How to Use the Model
📌 Load Model in Python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "Susant-Achary/Deepseek-R1-India-Finetuned-Distill-Llama-8B-unsloth-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision for faster, lighter inference
    device_map="auto",          # automatically place weights on available GPUs
)
model.eval()

def generate_response(prompt, max_new_tokens=150):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Test the model
print(generate_response("To which country does Arunachal Pradesh belong?"))
### Response:
Arunachal Pradesh is a state in **India**.
📊 Training Details
📜 Training Data
- Source: Curated datasets covering Indian history, geography, government policies, and cultural insights. 🇮🇳
- Data Augmentation: Added government documents, Wikipedia articles, and curated question-answer pairs (a hedged example of the QA format follows).
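The exact training format is not published here; as a hedged illustration, a curated question-answer pair might be stored and rendered into a training prompt like this (the field names and template are assumptions, not the actual schema):

# Hypothetical QA record (field names are assumptions, not the published schema)
qa_record = {
    "question": "To which country does Arunachal Pradesh belong?",
    "answer": "Arunachal Pradesh is a state in India.",
}

def format_example(record):
    """Render a QA record into a single training prompt (assumed template)."""
    return f"### Question:\n{record['question']}\n\n### Response:\n{record['answer']}"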
🔧 Training Procedure
- Preprocessing: Standard tokenization with transformers.
- Fine-tuning: LoRA (Low-Rank Adaptation) applied to DeepSeek-R1 for parameter-efficient training (see the sketch after this list).
- Mixed Precision (FP16) for better performance.
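A hedged sketch of what the LoRA setup might look like with the peft library; the rank, alpha, and target modules below are assumptions, not the published configuration:

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                      # low-rank dimension (assumption)
    lora_alpha=32,             # LoRA scaling factor (assumption)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections (assumption)
    lora_dropout=0.05,         # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only the small adapter weights are trained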
⚡ Hyperparameters
- Batch Size: 4 per GPU
- Gradient Accumulation Steps: 2
- Learning Rate: 2e-5
- Epochs: 3
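The hyperparameters listed above map onto transformers.TrainingArguments roughly as follows; the output path and logging settings are placeholders:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",            # placeholder path
    per_device_train_batch_size=4,   # Batch Size: 4 per GPU
    gradient_accumulation_steps=2,   # effective batch = 4 x 2 per GPU
    learning_rate=2e-5,
    num_train_epochs=3,
    fp16=True,                       # mixed-precision training (FP16)
    logging_steps=10,                # placeholder
)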
📈 Evaluation
🎯 Metrics
- Perplexity: Evaluated on a held-out dataset 🎯
- BLEU Score: Measured on QA responses 🎯
- Human Evaluation: Subject-matter expert review 🎯
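For reference, perplexity is the exponential of the mean token-level cross-entropy on held-out text; a minimal sketch of how it can be computed with the loaded model (the evaluation texts are placeholders):

import math
import torch

def perplexity(model, tokenizer, texts):
    """exp(mean cross-entropy per token) over a list of held-out strings."""
    total_loss, total_tokens = 0.0, 0
    model.eval()
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(text, return_tensors="pt").to(model.device)
            out = model(**enc, labels=enc["input_ids"])  # HF shifts labels internally
            n_tokens = enc["input_ids"].size(1)
            total_loss += out.loss.item() * n_tokens
            total_tokens += n_tokens
    return math.exp(total_loss / total_tokens)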
🌍 Environmental Impact
- Hardware Used: 2x Tesla T4 GPUs ⚡
- Total Training Time: Approximately 15 minutes
🏛️ Citation
If you use this model in your research, please cite it as:
@misc{deepseek_india_finetuned,
  title={Fine-Tuned DeepSeek Llama for Indian Knowledge},
  author={Susant-Achary},
  year={2025},
}
🇮🇳 Jai Hind! 🚀