---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-14B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- CoT
- Conversational
- text-generation-inference
model-index:
- name: QwQ-LCoT-14B-Conversational
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: wis-k/instruction-following-eval
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 40.47
      name: averaged accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FQwQ-LCoT-14B-Conversational
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: SaylorTwift/bbh
      split: test
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 45.63
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FQwQ-LCoT-14B-Conversational
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: lighteval/MATH-Hard
      split: test
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 31.42
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FQwQ-LCoT-14B-Conversational
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 13.31
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FQwQ-LCoT-14B-Conversational
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 20.62
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FQwQ-LCoT-14B-Conversational
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 47.54
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FQwQ-LCoT-14B-Conversational
      name: Open LLM Leaderboard
---
# **QwQ-LCoT-14B-Conversational**

QwQ-LCoT-14B-Conversational is built on Qwen2.5-14B-Instruct and fine-tuned for complex, chain-of-thought-driven long conversations. The fine-tuning targets tasks that require step-by-step reasoning, detailed explanations, and a nuanced understanding of intricate topics, optimizing the model for use cases that demand precision, depth, and adaptability in dialogue.

This makes it particularly effective for long-form discussions, detailed problem-solving, and multi-step reasoning. Its ability to maintain coherent, meaningful conversations over extended contexts makes it well suited to scenarios requiring thoughtful, dynamic interaction.


| Rank | Type | Model                             | Average | IFEval | BBH   | MATH  | GPQA  | MUSR  | MMLU-PRO | CO₂ Cost (kg) | Date       |
|------|------|-----------------------------------|---------|--------|-------|-------|-------|-------|----------|---------------|------------|
| 323  | 🔶   | [prithivMLmods/QwQ-LCoT-14B-Conversational](#) | 33.17 | 40.47 | 45.63 | 31.42 | 13.31 | 20.62 | 47.54 | 1.95 | 01/20/2025 |

## **Key Features**

### **Enhanced Knowledge and Capabilities**
- **Coding and Mathematics**: Significantly improved performance in coding and mathematical tasks, thanks to specialized expert models in these domains.

### **Advanced Instruction Following**
- **Instruction Following**: Enhanced ability to follow instructions accurately, even for complex tasks.
- **Long Text Generation**: Capable of generating long texts exceeding 8,000 tokens.
- **Structured Data Understanding**: Improved understanding of structured data such as tables.
- **JSON Generation**: Exceptional ability to generate structured outputs, including JSON (see the example after the quickstart below).

### **Resilient and Versatile**
- **Prompt Diversity**: Greater resilience to diverse system prompts, enhancing role-play scenarios and condition-setting for chatbots.

### **Long-Context Support**
- **Context Length**: Supports up to 128,000 tokens, with the ability to generate up to 8,000 tokens in a single response (a long-context loading sketch follows this list).
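
For inputs beyond 32,768 tokens, the base Qwen2.5 models document a YaRN rope-scaling recipe. Below is a minimal sketch, assuming this fine-tune inherits that recipe; the `rope_scaling` values mirror the upstream Qwen2.5 `config.json` snippet and should be verified against the base model's documentation.

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/QwQ-LCoT-14B-Conversational"

# Assumption: this fine-tune inherits the base Qwen2.5 long-context (YaRN)
# recipe; these values mirror the upstream Qwen2.5 config.json snippet.
config = AutoConfig.from_pretrained(model_name)
config.rope_scaling = {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn",
}

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

Note that static YaRN scaling applies regardless of input length, so the upstream docs suggest enabling it only when long contexts are actually needed.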

## **Quickstart**

The following snippet shows how to load the tokenizer and model and generate content using `apply_chat_template`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/QwQ-LCoT-14B-Conversational"

# Load the model weights (auto-selected dtype) and shard across available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the chat history into the model's expected prompt format.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated reply remains.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
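
To exercise the structured-output capability noted above, the snippet below is a minimal sketch reusing `model` and `tokenizer` from the quickstart; the system prompt and target schema are illustrative, not part of the model's documented training setup.

```python
import json

# Ask for a JSON-only reply; the schema in the prompt is a made-up example.
messages = [
    {"role": "system", "content": "You are a helpful assistant. Reply only with valid JSON."},
    {
        "role": "user",
        "content": 'Extract the fields {"name": string, "year": number} from: '
                   '"Qwen2.5 was released by Alibaba Cloud in 2024."',
    },
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=128)
reply = tokenizer.batch_decode(
    [generated_ids[0][model_inputs.input_ids.shape[1]:]],
    skip_special_tokens=True,
)[0]

# json.loads raises if the model wraps the JSON in prose; validate or retry
# in production use.
data = json.loads(reply)
print(data)
```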

### **Multilingual Support**

QwQ-LCoT-14B-Conversational offers robust multilingual support across more than 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic. This makes it well suited to global applications such as multilingual customer support, cross-cultural communication, and localized content creation, with consistent quality across diverse linguistic contexts.
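
For example, the same chat-template pipeline as the quickstart works with a non-English prompt; the snippet below is illustrative only and reuses `model` and `tokenizer` from above.

```python
# Illustrative only: a French prompt fed through the same pipeline as the
# English quickstart (reuses `model` and `tokenizer` from above).
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Donne-moi une courte introduction aux grands modèles de langage."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=256)
response = tokenizer.batch_decode(
    [generated_ids[0][model_inputs.input_ids.shape[1]:]],
    skip_special_tokens=True,
)[0]
print(response)  # the reply typically follows the language of the prompt
```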

## **Applications**

QwQ-LCoT-14B-Conversational is ideal for:
- Long-form conversational AI
- Complex reasoning and chain-of-thought explanations
- Multilingual communication
- Structured data generation and processing
- Enhanced role-play and chatbot implementation

## **Intended Use**

1. **Long-Form Dialogue Systems**: QwQ-LCoT-14B-Conversational is designed for creating conversational agents capable of engaging in extended, context-rich dialogues, making it suitable for applications like customer support, virtual assistants, and interactive storytelling.

2. **Complex Reasoning Tasks**: The model excels at tasks requiring step-by-step reasoning, such as solving mathematical problems, coding challenges, and logical puzzles.

3. **Multilingual Communication**: With support for over 29 languages, the model is ideal for global applications, including multilingual customer service, translation, and cross-cultural communication.

4. **Structured Data Processing**: The model’s ability to understand and generate structured data (e.g., tables, JSON) makes it useful for data analysis, report generation, and API integration.

5. **Content Generation**: It can generate high-quality, long-form content, including articles, essays, and technical documentation, across various domains and languages.

6. **Role-Play and Chatbots**: The model’s resilience to diverse system prompts enhances its ability to simulate characters, role-play scenarios, and implement dynamic chatbot interactions.

## **Limitations**

1. **Performance Variability Across Languages**: While the model supports multiple languages, its performance may vary depending on the language, with better results for languages more prevalent in its training data.

2. **Handling of Niche Topics**: The model may struggle to provide accurate information or generate high-quality content for highly specialized or niche topics not covered extensively in its training data.

3. **Complex Multi-Step Reasoning**: Although optimized for reasoning tasks, the model may occasionally produce incorrect or incomplete results for highly complex or ambiguous problems.

4. **Bias and Ethical Concerns**: As with any large language model, QwQ-LCoT-14B-Conversational may inherit biases present in its training data, leading to potential ethical concerns or inappropriate outputs in certain contexts.

5. **Context Limitations**: Despite its large context window, the model may still face challenges in maintaining coherence and relevance for extremely long or dense inputs.

6. **Resource Intensive**: As a large-scale model with 14 billion parameters, it requires substantial computational resources for both inference and deployment, limiting its use in resource-constrained environments.

7. **Instruction Ambiguity**: The model’s performance can degrade when instructions are ambiguous, vague, or conflicting, potentially leading to outputs that do not align with user expectations.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/prithivMLmods__QwQ-LCoT-14B-Conversational-details)!
Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=prithivMLmods%2FQwQ-LCoT-14B-Conversational&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!

|      Metric       |Value (%)|
|-------------------|--------:|
|**Average**        |    33.16|
|IFEval (0-Shot)    |    40.47|
|BBH (3-Shot)       |    45.63|
|MATH Lvl 5 (4-Shot)|    31.42|
|GPQA (0-shot)      |    13.31|
|MuSR (0-shot)      |    20.62|
|MMLU-PRO (5-shot)  |    47.54|