Model Card for MR-Llama-3.1-8B-Instruct
Model Introduction
This model is the Mind Router from the paper DynamicMind: A Tri-Mode Thinking System for Large Language Models.
It is a text classification model based on DeBERTa-v3-base that predicts the optimal thinking mode for an LLM to use when answering a given question. The goal is to dynamically balance reasoning accuracy and computational efficiency.
The model classifies questions into one of three Thinking Modes:
- Fast: delivers rapid, intuitive responses by limiting cognitive depth, prioritizing speed over thorough reasoning.
- Normal: leverages the LLM's native capabilities to balance response quality and efficiency.
- Slow: executes deep analytical reasoning for high-quality outputs with increased computational costs.
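To see what the router does on its own, here is a minimal sketch that classifies a single question. The exact label strings come from the model's id2label config, so the label shown in the comment is an assumption:

```python
from transformers import pipeline

# Load only the Mind Router and classify one question.
router = pipeline(
    "text-classification",
    model="0xWei/MR-Llama-3.1-8B-Instruct",
    trust_remote_code=True,
)
print(router("What is 2 + 2?"))
# Expected shape: [{'label': '<mode>', 'score': ...}],
# e.g. a label such as 'Fast' for a trivial question (label names are an assumption).
```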
How to Use
Here is a runnable example of how to use the DynamicMind framework for inference.
1. Installation
First, clone this repository and install the necessary libraries. The example below also needs torch (the transformers backend) and pyyaml (for the mode config files):

```bash
git clone https://github.com/DL-Levi/DynamicMind
cd DynamicMind
pip install transformers==4.47.1 torch pyyaml
```
2. Running Inference
```python
import os
import yaml
from transformers import pipeline
from modes.mode import ThinkingMode


def load_config(file_path):
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"config file '{file_path}' not found")
    with open(file_path, 'r') as file:
        return yaml.safe_load(file)


def DynamicMind(prompt, llm_thinker, mind_router):
    # Use the Mind Router to select the optimal thinking mode.
    # The text-classification pipeline returns a list of
    # {'label', 'score'} dicts; take the top prediction.
    router_output = pipeline("text-classification", model=mind_router, trust_remote_code=True)(prompt)
    thinking_mode = router_output[0]['label']
    print(f"Mind Router selected: {thinking_mode} Mode")

    # Load the prompt templates and generation settings for the selected mode.
    mode_config = load_config(f"../config/{thinking_mode}_mode_config.yaml")
    system_prompt = mode_config['mode']['system_prompt']
    format_prompt = mode_config['mode']['format_prompt'].format(answer_format='42')

    # Generate a response with the LLM Thinker under the mode-specific settings.
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt + format_prompt},
    ]
    thinker_output = pipeline("text-generation", model=llm_thinker, **mode_config['generation'])(messages)
    # With chat-style input, 'generated_text' holds the whole conversation;
    # the last message is the assistant's reply.
    response = thinker_output[0]['generated_text'][-1]['content']
    return response


# --- Example Usage ---
prompt = "Please answer the following question: It is approximately 1955 kilometers from San Diego, California to New York City, New York. If Bernice drove 325 kilometers for 4 days, how many kilometers will she still need to drive?"
llm_thinker = "meta-llama/Meta-Llama-3.1-8B-Instruct"
mind_router = "0xWei/MR-Llama-3.1-8B-Instruct"
response = DynamicMind(prompt, llm_thinker, mind_router)
print(response)
```
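For this example question, the intended answer is 1955 − 4 × 325 = 655 kilometers, which gives a quick sanity check on the output of any mode. The mode config files live in the repo's config/ directory; the sketch below shows the shape the code above expects after yaml.safe_load. The keys are taken from what DynamicMind reads, but the values are illustrative placeholders, not the repo's actual prompts:

```python
# Hypothetical mode config after yaml.safe_load -- keys mirror what
# DynamicMind reads; values here are illustrative placeholders.
example_mode_config = {
    "mode": {
        "system_prompt": "You answer questions directly and concisely.",
        "format_prompt": "\nEnd your reply with: The answer is {answer_format}.",
    },
    "generation": {  # forwarded as kwargs to the text-generation pipeline
        "max_new_tokens": 512,
        "do_sample": False,
    },
}
```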
Training Data
The Mind Router was trained on the Thinking Mode Capacity (TMC) dataset, which was created as part of our research. The TMC datasets are available at:
- TMC dataset for Llama: TMC-Llama-3.1-8B-Instruct
- TMC dataset for Qwen: TMC-Qwen-2.5-7B-Instruct
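To inspect the training data, a sketch like the following should work, assuming the dataset is published under the same 0xWei namespace as this model (the card links the datasets by name only, so the exact hub ID is an assumption):

```python
from datasets import load_dataset

# Hypothetical hub ID -- the namespace is an assumption; adjust as needed.
tmc = load_dataset("0xWei/TMC-Llama-3.1-8B-Instruct")
print(tmc)
```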
Citation
If you find our work helpful, please cite our paper:
```bibtex
@article{li2025dynamicmind,
  title={DynamicMind: A Tri-Mode Thinking System for Large Language Models},
  author={Li, Wei and Wei, Yanbin and Huang, Qiushi and Yan, Jiangyue and Chen, Yang and Kwok, James T. and Zhang, Yu},
  journal={arXiv preprint arXiv:2506.05936},
  year={2025}
}
```