---
license: apache-2.0
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- Qwen/Qwen2.5-14B-Instruct-1M
pipeline_tag: text-generation
library_name: transformers
---

# Qwen2.5-14B-DeepSeek-R1-1M

A merge that combines the strengths of a reasoning model (DeepSeek-R1-Distill-Qwen-14B) with the capabilities of a long-context model (Qwen2.5-14B-Instruct-1M) for versatile performance.

# Merge config

```yaml
models:
  - model: "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"
    parameters:
      weight: 1
      density: 1
merge_method: ties
base_model: "Qwen/Qwen2.5-14B-Instruct-1M"
parameters:
  density: 1
  normalize: true
  int8_mask: true
dtype: bfloat16
```

I also made some minor adjustments to the tokenizer configuration. A sketch of reproducing the merge with `mergekit` is included at the end of this card.

# How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mkurman/Qwen2.5-14B-DeepSeek-R1-1M"

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Write a Python script to merge two CSV files."
messages = [
    {"role": "system", "content": "You are an expert programmer."},
    {"role": "user", "content": prompt},
]

# Apply the chat template and append the assistant-turn marker before generating
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

You can also use the model in `LM Studio` or `Ollama` via the provided GGUF files; a sketch of an Ollama setup is included at the end of this card.

# License

Apache 2.0, for open-source contribution and collaboration.
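# Reproducing the merge

As a minimal sketch (not the exact command used to build this model), the YAML config above can be run with `mergekit`; the config filename and output directory below are placeholders:

```bash
pip install mergekit

# Save the YAML config above as ties-config.yaml, then run the TIES merge.
# --cuda performs the tensor arithmetic on GPU; drop it to merge on CPU.
mergekit-yaml ties-config.yaml ./Qwen2.5-14B-DeepSeek-R1-1M --cuda
```

Note that a merge produced this way may still need the minor tokenizer adjustments mentioned above.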
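# Running a GGUF file with Ollama

A minimal sketch of serving one of the GGUF files locally with Ollama; the GGUF filename (including its quantization suffix) and the model tag are placeholders, so substitute the actual file you downloaded:

```bash
# Point a minimal Modelfile at a local GGUF file (filename is a placeholder)
echo 'FROM ./Qwen2.5-14B-DeepSeek-R1-1M-Q4_K_M.gguf' > Modelfile

# Register the model with Ollama, then chat with it interactively
ollama create qwen2.5-14b-deepseek-r1-1m -f Modelfile
ollama run qwen2.5-14b-deepseek-r1-1m
```

In `LM Studio`, the same GGUF file can simply be loaded through the UI.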