---
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
library_name: peft
pipeline_tag: text-generation
language: en
tags:
  - deepseek
  - text-generation
  - conversational
---

# DeepSeek Chatbot

This is a fine-tuned version of [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B), optimized for conversational AI applications. It retains the base model's capabilities while being tuned for improved dialogue interactions.

## Model Details

### Model Description

- **Developed by:** Trinoid
- **Model type:** Conversational language model
- **Language(s):** English
- **License:** Same as the base model (DeepSeek-R1-Distill-Qwen-1.5B)
- **Finetuned from model:** deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

## Uses

### Direct Use

This model can be used for:

- General conversation
- Text generation
- Question answering
- Chat-based applications

Example usage via the Hugging Face Inference API:

```python
from huggingface_hub import InferenceClient

client = InferenceClient("Trinoid/Deepseek_Chatbot")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, how are you?"},
]

response = client.chat_completion(
    messages,
    max_tokens=512,
    temperature=0.7,
    top_p=0.95,
)

# The generated reply is in the first choice of the completion.
print(response.choices[0].message.content)
```
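Because this model is a PEFT/LoRA adapter (see Training Details below), it can also be loaded locally on top of the base model with `transformers` and `peft`. The following is a sketch, not a tested recipe: the repository ids come from this card, while the dtype, device, and generation settings are illustrative choices.

```python
# Repositories as listed on this model card.
BASE_MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
ADAPTER_REPO = "Trinoid/Deepseek_Chatbot"

if __name__ == "__main__":
    # Imports are kept inside the guard so the constants above can be
    # inspected without transformers/peft installed; the model download
    # happens only when this file is run directly.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL, torch_dtype=torch.float16, device_map="auto"
    )
    # Attach the LoRA adapter weights to the frozen base model.
    model = PeftModel.from_pretrained(base, ADAPTER_REPO)

    messages = [{"role": "user", "content": "Hello, how are you?"}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(
        inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.95
    )
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```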

### Out-of-Scope Use

This model should not be used for:

- Generation of harmful or malicious content
- Spreading misinformation
- Production of illegal content
- Making critical decisions without human oversight

## Training Details

### Training Procedure

#### Training Hyperparameters

- **Training regime:** fp16 mixed precision
- **Framework:** PEFT (Parameter-Efficient Fine-Tuning)
- **PEFT method:** LoRA
- **Version:** PEFT 0.14.0
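LoRA (Low-Rank Adaptation) freezes the base weight matrix `W` and learns only a small update `ΔW = B @ A`, where `A` is `r × k` and `B` is `d × r` with rank `r` much smaller than `d` and `k`, so far fewer parameters are trained. A minimal numeric sketch of the idea, with toy dimensions that have nothing to do with this model's actual layer sizes:

```python
# Illustrative LoRA update: W_eff = W + B @ A, with rank r << d, k.
# Plain-Python matrices (lists of rows) keep the sketch dependency-free.

def matmul(X, Y):
    """Multiply an (m x n) matrix by an (n x p) matrix."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

d, k, r = 4, 4, 1                      # toy sizes; real layers are much larger
W = [[0.0] * k for _ in range(d)]      # frozen base weight (zeros for clarity)
B = [[1.0], [2.0], [0.0], [0.0]]       # d x r trainable factor
A = [[0.5, 0.0, 0.5, 0.0]]             # r x k trainable factor

W_eff = add(W, matmul(B, A))           # effective weight used at inference

# Trainable parameters: d*r + r*k = 8, versus d*k = 16 for full fine-tuning.
print(W_eff[0])                        # → [0.5, 0.0, 0.5, 0.0]
```

At inference, frameworks like PEFT either keep the adapter separate or merge `B @ A` into `W` once, so the adapted model runs at the same speed as the base model.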

## Technical Specifications

### Model Architecture and Objective

- **Base architecture:** DeepSeek-R1-Distill-Qwen-1.5B
- **Fine-tuning method:** PEFT/LoRA
- **Primary objective:** Conversational AI

### Compute Infrastructure

#### Software

- PEFT 0.14.0
- Transformers
- Python 3.x

## Model Card Contact

For questions or issues about this model, please open an issue in the model repository.