# Qwen2.5-DeepHyper

This is a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).

## Merge Details

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) as the base.
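
For intuition: DARE first sparsifies each model's task vector (its delta from the base) by randomly dropping entries and rescaling the survivors, and TIES then merges the sparse deltas by sign-consensus averaging. The snippet below is a minimal, single-tensor sketch of that idea, not mergekit's actual implementation; all names and the toy shapes are illustrative.

```python
import torch


def dare(delta: torch.Tensor, density: float) -> torch.Tensor:
    """DARE: drop each delta entry with probability (1 - density), rescale survivors."""
    mask = torch.bernoulli(torch.full_like(delta, density))
    return delta * mask / density


def dare_ties(base, deltas, weights, densities):
    """Toy DARE-TIES merge of per-model task vectors into the base tensor."""
    sparse = [dare(d, p) * w for d, p, w in zip(deltas, densities, weights)]
    stacked = torch.stack(sparse)                     # (n_models, *param_shape)
    elected = torch.sign(stacked.sum(dim=0))          # majority sign per entry
    agree = (torch.sign(stacked) == elected).float()  # keep agreeing entries only
    merged = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1.0)
    return base + merged


# Toy usage: random tensors stand in for one parameter of base and three finetunes.
base = torch.randn(4, 4)
deltas = [torch.randn(4, 4) * 0.1 for _ in range(3)]
print(dare_ties(base, deltas, weights=[0.9, 1.0, 0.75], densities=[0.9, 1.0, 0.75]))
```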

### Models Merged

The following models were included in the merge:

* [CultriX/Qwen2.5-14B-Hyperionv3_r128](https://huggingface.co/CultriX/Qwen2.5-14B-Hyperionv3_r128)
* [CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128](https://huggingface.co/CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128)
* [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
* CultriX/Qwen2.5-14B-DeepSeek_r128 (referenced via a local cache path in the configuration below)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: Qwen/Qwen2.5-14B
models:
  # Each adapter was extracted (rank=128) from its respective fine-tuned model;
  # per-model weight and density control each one's contribution to the merge.
  - model: CultriX/Qwen2.5-14B-Hyperionv3_r128
    parameters:
      weight: 0.9  # Slightly reduced contribution
      density: 0.9

  - model: CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128
    parameters:
      weight: 1.0
      density: 1.0

  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      weight: 0.75
      density: 0.75

  - model: /root/.cache/huggingface/hub/models--CultriX--Qwen2.5-14B-DeepSeek_r128/snapshots/1bca847f92fced165076d9ac921a1e3ef01fcd7f/
    parameters:
      weight: 1.00
      density: 1.00

# Merging method and overall parameters
merge_method: dare_ties         # DARE pruning of task vectors + TIES sign-consensus merging.
parameters:
  weight: 1.0                 # Overall scaling factor.
  density: 1.0                # Overall density (typically left at 1.0).
  normalize: true             # Normalize each set of weights before merging.
  int8_mask: true             # Store intermediate masks as int8 to save memory.

# Take the tokenizer from the Virtuoso LoRA model to ensure compatibility.
tokenizer_source: CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128

# Data type for merged weights.
dtype: bfloat16
```
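
A configuration like this is typically executed with mergekit's `mergekit-yaml` CLI or its Python API. Below is a minimal sketch following the API shown in mergekit's README; the config filename and output directory are placeholders.

```python
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the YAML config above (assumed saved as config.yaml).
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    "./Qwen2.5-DeepHyper",    # output directory (placeholder)
    options=MergeOptions(
        cuda=True,            # run the merge on GPU if available
        copy_tokenizer=True,  # honor tokenizer_source from the config
        lazy_unpickle=True,   # reduce peak memory while loading shards
    ),
)
```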

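Once merged (or downloaded from the Hub), the model loads like any other Qwen2.5 checkpoint. A minimal inference sketch with `transformers`; the chat-template call assumes the instruct-style tokenizer configured above, and `device_map="auto"` requires `accelerate` to be installed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CultriX/Qwen2.5-DeepHyper"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Summarize what a DARE-TIES merge does."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```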