DeepSeek Chatbot

This is a fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B, adapted with LoRA for conversational AI applications. The fine-tuning is intended to improve dialogue interactions while retaining the base model's capabilities.

Model Details

Model Description

  • Developed by: Trinoid
  • Model type: Conversational Language Model
  • Language(s): English
  • License: Same as base model (DeepSeek-R1-Distill-Qwen-1.5B)
  • Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Uses

Direct Use

This model can be used for:

  • General conversation
  • Text generation
  • Question answering
  • Chat-based applications

Example usage:

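from huggingface_hub import InferenceClient

# Client bound to this model's repository on the Hugging Face Hub
client = InferenceClient("Trinoid/Deepseek_Chatbot")

# Conversation in chat format: a system prompt followed by the user turn
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, how are you?"}
]

# Request a completion with the sampling settings below
response = client.chat_completion(
    messages,
    max_tokens=512,
    temperature=0.7,
    top_p=0.95
)

# Print the assistant's reply
print(response.choices[0].message.content)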
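
DeepSeek-R1 distilled models typically emit their reasoning inside <think>...</think> tags before the final answer. Whether this fine-tune keeps that behavior is an assumption, not something the card confirms. If it does, a chat application would likely want to drop the reasoning before showing or storing a reply. The sketch below shows one way to do that with the standard library only; the helper names strip_reasoning and add_turn and the sample reply are hypothetical.

```python
def strip_reasoning(text: str) -> str:
    """Return the text after the last </think> tag, or the whole text if absent."""
    # rpartition yields ('', '', text) when the tag is missing, so index 2
    # is the full text in that case and the answer portion otherwise.
    return text.rpartition("</think>")[2].strip()


def add_turn(messages: list, role: str, content: str) -> list:
    """Return a new message list with one more turn appended."""
    return messages + [{"role": role, "content": content}]


# Hypothetical raw reply, standing in for response.choices[0].message.content.
raw_reply = "<think>\nThe user greeted me.\n</think>\n\nI'm doing well, thanks!"

history = [{"role": "user", "content": "Hello, how are you?"}]
history = add_turn(history, "assistant", strip_reasoning(raw_reply))
print(history[-1]["content"])
```

Passing the updated history back to chat_completion continues the conversation. A reply cut off by max_tokens before the closing tag is returned unchanged by this helper, so callers may want to handle that case separately.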

Out-of-Scope Use

This model should not be used for:

  • Generation of harmful or malicious content
  • Spreading misinformation
  • Production of illegal content
  • Making critical decisions without human oversight

Training Details

Training Procedure

Training Hyperparameters

  • Training regime: fp16 mixed precision
  • Framework: PEFT (Parameter-Efficient Fine-Tuning)
  • PEFT Method: LoRA
  • Version: PEFT 0.14.0
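
To give a sense of why LoRA counts as parameter-efficient, the sketch below computes how many trainable parameters a low-rank update adds to a single weight matrix. The rank and matrix size used here are illustrative assumptions; this card does not state the LoRA rank or which layers were adapted.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns the weight update as B @ A, with A of shape (rank, d_in)
    and B of shape (d_out, rank), so it adds rank * (d_in + d_out) values.
    """
    return rank * (d_in + d_out)


# Assumed values for illustration only; not taken from this model's config.
hidden_size = 1536
rank = 8

full = hidden_size * hidden_size
added = lora_param_count(hidden_size, hidden_size, rank)
print(f"full matrix: {full:,} parameters")
print(f"LoRA update: {added:,} parameters ({added / full:.2%} of full)")
```

Under these assumptions the adapter trains roughly one percent of the parameters of each adapted matrix, while the base weights stay frozen.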

Technical Specifications

Model Architecture and Objective

  • Base architecture: DeepSeek-R1-Distill-Qwen-1.5B
  • Fine-tuning method: PEFT/LoRA
  • Primary objective: Conversational AI

Compute Infrastructure

Software

  • PEFT 0.14.0
  • Transformers
  • Python 3.x

Model Card Contact

For questions or issues about this model, please open an issue in the model repository.
