DeepSeek Chatbot
This is a fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B, optimized for conversational AI applications. The model maintains the base model's capabilities while being tuned for improved dialogue interactions.
Model Details
Model Description
- Developed by: Trinoid
- Model type: Conversational Language Model
- Language(s): English
- License: Same as base model (DeepSeek-R1-Distill-Qwen-1.5B)
- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
Uses
Direct Use
This model can be used for:
- General conversation
- Text generation
- Question answering
- Chat-based applications
Example usage:
from huggingface_hub import InferenceClient

# Query the model through the Hugging Face Inference API
client = InferenceClient("Trinoid/Deepseek_Chatbot")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, how are you?"},
]

response = client.chat_completion(
    messages,
    max_tokens=512,
    temperature=0.7,
    top_p=0.95,
)

# The reply text is available on the first choice of the completion
print(response.choices[0].message.content)
Out-of-Scope Use
This model should not be used for:
- Generation of harmful or malicious content
- Spreading misinformation
- Production of illegal content
- Making critical decisions without human oversight
Training Details
Training Procedure
Training Hyperparameters
- Training regime: fp16 mixed precision
- Framework: PEFT (Parameter-Efficient Fine-Tuning)
- PEFT Method: LoRA
- Version: PEFT 0.14.0
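The exact LoRA hyperparameters are not published in this card. The sketch below shows how a comparable PEFT 0.14.0 fine-tune with fp16 mixed precision is typically configured; the rank, alpha, dropout, and target modules are illustrative assumptions, not the values used for this model.

# Illustrative PEFT/LoRA setup; r, lora_alpha, lora_dropout, and target_modules
# are assumptions, not the values used to train this model.
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, get_peft_model

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

lora_config = LoraConfig(
    r=16,                                  # assumed adapter rank
    lora_alpha=32,                         # assumed scaling factor
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# fp16 mixed precision, as listed above; these arguments would be passed to a
# transformers Trainer together with a dialogue dataset (not shown here).
training_args = TrainingArguments(
    output_dir="deepseek-chatbot-lora",
    fp16=True,
)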
Technical Specifications
Model Architecture and Objective
- Base architecture: DeepSeek-R1-Distill-Qwen-1.5B
- Fine-tuning method: PEFT/LoRA
- Primary objective: Conversational AI
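Because the fine-tune is a LoRA adapter over the base architecture, it can also be run locally by attaching the adapter to the base model. This is a minimal sketch, assuming the repository hosts PEFT adapter weights:

# Minimal local-loading sketch; assumes Trinoid/Deepseek_Chatbot contains PEFT adapter weights.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
adapter_id = "Trinoid/Deepseek_Chatbot"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA adapter

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))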
Compute Infrastructure
Software
- PEFT 0.14.0
- Transformers
- Python 3.x
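A minimal environment matching the software listed above can be installed roughly as follows (torch and huggingface_hub are assumed dependencies for the examples in this card):

pip install "peft==0.14.0" transformers huggingface_hub torch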
Model Card Contact
For questions or issues about this model, please open an issue in the model repository.