Model Card for MR-Llama-3.1-8B-Instruct
Model Introduction
This model is the Mind Router from the paper DynamicMind: A Tri-Mode Thinking System for Large Language Models.
It is a text classification model based on DeBERTa-v3-base that predicts the optimal thinking mode for an LLM to use when answering a given question. The goal is to dynamically balance reasoning accuracy and computational efficiency.
The model classifies questions into one of three Thinking Modes:
- Fast: delivers rapid, intuitive responses by limiting cognitive depth, prioritizing speed over thorough reasoning.
- Normal: leverages the LLM's native capabilities to balance response quality and efficiency.
- Slow: executes deep analytical reasoning for high-quality outputs with increased computational costs.
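To see what the router does on its own, here is a minimal sketch that classifies a single question. The exact label strings come from the model's id2label config, so the label shown in the comment is an assumption:

```python
from transformers import pipeline

# Load only the Mind Router and classify one question.
router = pipeline(
    "text-classification",
    model="0xWei/MR-Llama-3.1-8B-Instruct",
    trust_remote_code=True,
)
print(router("What is 2 + 2?"))
# Expected shape: [{'label': '<mode>', 'score': ...}],
# e.g. a label such as 'Fast' for a trivial question (label names are an assumption).
```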
How to Use
Here is a runnable example of how to use the DynamicMind framework for inference.
1. Installation
First, clone this repository and install the necessary libraries. The example below also needs torch (the transformers backend) and pyyaml (for the mode config files):

```bash
git clone https://github.com/DL-Levi/DynamicMind
cd DynamicMind
pip install transformers==4.47.1 torch pyyaml
```
2. Running Inference
```python
import os
import yaml
from transformers import pipeline
from modes.mode import ThinkingMode


def load_config(file_path):
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"config file '{file_path}' not found")
    with open(file_path, 'r') as file:
        return yaml.safe_load(file)


def DynamicMind(prompt, llm_thinker, mind_router):
    # Use the Mind Router to select the optimal thinking mode.
    # The text-classification pipeline returns a list of
    # {'label', 'score'} dicts; take the top prediction.
    router_output = pipeline("text-classification", model=mind_router, trust_remote_code=True)(prompt)
    thinking_mode = router_output[0]['label']
    print(f"Mind Router selected: {thinking_mode} Mode")

    # Load the prompt templates and generation settings for the selected mode.
    mode_config = load_config(f"../config/{thinking_mode}_mode_config.yaml")
    system_prompt = mode_config['mode']['system_prompt']
    format_prompt = mode_config['mode']['format_prompt'].format(answer_format='42')

    # Generate a response with the LLM Thinker under the mode-specific settings.
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt + format_prompt},
    ]
    thinker_output = pipeline("text-generation", model=llm_thinker, **mode_config['generation'])(messages)
    # With chat-style input, 'generated_text' holds the whole conversation;
    # the last message is the assistant's reply.
    response = thinker_output[0]['generated_text'][-1]['content']
    return response


# --- Example Usage ---
prompt = "Please answer the following question: It is approximately 1955 kilometers from San Diego, California to New York City, New York. If Bernice drove 325 kilometers for 4 days, how many kilometers will she still need to drive?"
llm_thinker = "meta-llama/Meta-Llama-3.1-8B-Instruct"
mind_router = "0xWei/MR-Llama-3.1-8B-Instruct"
response = DynamicMind(prompt, llm_thinker, mind_router)
print(response)
```
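For this example question, the intended answer is 1955 − 4 × 325 = 655 kilometers, which gives a quick sanity check on the output of any mode. The mode config files live in the repo's config/ directory; the sketch below shows the shape the code above expects after yaml.safe_load. The keys are taken from what DynamicMind reads, but the values are illustrative placeholders, not the repo's actual prompts:

```python
# Hypothetical mode config after yaml.safe_load -- keys mirror what
# DynamicMind reads; values here are illustrative placeholders.
example_mode_config = {
    "mode": {
        "system_prompt": "You answer questions directly and concisely.",
        "format_prompt": "\nEnd your reply with: The answer is {answer_format}.",
    },
    "generation": {  # forwarded as kwargs to the text-generation pipeline
        "max_new_tokens": 512,
        "do_sample": False,
    },
}
```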
Training Data
The Mind Router was trained on the Thinking Mode Capacity (TMC) dataset, which was created as part of our research. The TMC datasets are available at:
- TMC dataset for Llama: TMC-Llama-3.1-8B-Instruct
- TMC dataset for Qwen: TMC-Qwen-2.5-7B-Instruct
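To inspect the training data, a sketch like the following should work, assuming the dataset is published under the same 0xWei namespace as this model (the card links the datasets by name only, so the exact hub ID is an assumption):

```python
from datasets import load_dataset

# Hypothetical hub ID -- the namespace is an assumption; adjust as needed.
tmc = load_dataset("0xWei/TMC-Llama-3.1-8B-Instruct")
print(tmc)
```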
Citation
If you find our work helpful, please cite our paper:
```bibtex
@article{li2025dynamicmind,
  title={DynamicMind: A Tri-Mode Thinking System for Large Language Models},
  author={Li, Wei and Wei, Yanbin and Huang, Qiushi and Yan, Jiangyue and Chen, Yang and Kwok, James T. and Zhang, Yu},
  journal={arXiv preprint arXiv:2506.05936},
  year={2025}
}
```