DeepSeek Chatbot
This is a fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B, optimized for conversational AI applications. The model maintains the base model's capabilities while being tuned for improved dialogue interactions.
Model Details
Model Description
- Developed by: Trinoid
- Model type: Conversational Language Model
- Language(s): English
- License: Same as base model (DeepSeek-R1-Distill-Qwen-1.5B)
- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
Uses
Direct Use
This model can be used for:
- General conversation
- Text generation
- Question answering
- Chat-based applications
Example usage:
from huggingface_hub import InferenceClient

# Query the model through the Hugging Face Inference API
client = InferenceClient("Trinoid/Deepseek_Chatbot")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, how are you?"},
]

response = client.chat_completion(
    messages,
    max_tokens=512,
    temperature=0.7,
    top_p=0.95,
)

# The reply text is available on the first choice of the completion
print(response.choices[0].message.content)
Out-of-Scope Use
This model should not be used for:
- Generation of harmful or malicious content
- Spreading misinformation
- Production of illegal content
- Making critical decisions without human oversight
Training Details
Training Procedure
Training Hyperparameters
- Training regime: fp16 mixed precision
- Framework: PEFT (Parameter-Efficient Fine-Tuning)
- PEFT Method: LoRA
- Version: PEFT 0.14.0
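The exact LoRA hyperparameters are not published in this card. The sketch below shows how a comparable PEFT 0.14.0 fine-tune with fp16 mixed precision is typically configured; the rank, alpha, dropout, and target modules are illustrative assumptions, not the values used for this model.

# Illustrative PEFT/LoRA setup; r, lora_alpha, lora_dropout, and target_modules
# are assumptions, not the values used to train this model.
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, get_peft_model

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

lora_config = LoraConfig(
    r=16,                                  # assumed adapter rank
    lora_alpha=32,                         # assumed scaling factor
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# fp16 mixed precision, as listed above; these arguments would be passed to a
# transformers Trainer together with a dialogue dataset (not shown here).
training_args = TrainingArguments(
    output_dir="deepseek-chatbot-lora",
    fp16=True,
)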
Technical Specifications
Model Architecture and Objective
- Base architecture: DeepSeek-R1-Distill-Qwen-1.5B
- Fine-tuning method: PEFT/LoRA
- Primary objective: Conversational AI
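Because the fine-tune is a LoRA adapter over the base architecture, it can also be run locally by attaching the adapter to the base model. This is a minimal sketch, assuming the repository hosts PEFT adapter weights:

# Minimal local-loading sketch; assumes Trinoid/Deepseek_Chatbot contains PEFT adapter weights.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
adapter_id = "Trinoid/Deepseek_Chatbot"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA adapter

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))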
Compute Infrastructure
Software
- PEFT 0.14.0
- Transformers
- Python 3.x
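A minimal environment matching the software listed above can be installed roughly as follows (torch and huggingface_hub are assumed dependencies for the examples in this card):

pip install "peft==0.14.0" transformers huggingface_hub torch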
Model Card Contact
For questions or issues about this model, please open an issue in the model repository.