merge

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged with the DARE TIES merge method, using Qwen/Qwen2.5-14B as the base model.
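
As a rough illustration of what a DARE TIES merge does, the sketch below applies the idea to flat lists of numbers instead of real model tensors. It is a simplified approximation under stated assumptions, not mergekit's implementation: the function names `dare_prune` and `dare_ties_merge` are hypothetical, sign election is assumed to follow the sign of the summed weighted deltas, and normalization is assumed to divide by the weights of the models that agree with the elected sign.

```python
import random


def dare_prune(delta, density, rng):
    """DARE step: keep each entry with probability `density`,
    then rescale the survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, finetuned, weights, densities, normalize=True, seed=0):
    """Toy DARE TIES merge over flat lists of floats (hypothetical helper)."""
    rng = random.Random(seed)
    # Task vectors: difference between each fine-tuned model and the base,
    # randomly sparsified and rescaled.
    deltas = [
        dare_prune([f - b for f, b in zip(model, base)], density, rng)
        for model, density in zip(finetuned, densities)
    ]
    merged = []
    for i, b in enumerate(base):
        weighted = [w * d[i] for w, d in zip(weights, deltas)]
        # TIES sign election (assumed): the sign of the summed weighted deltas wins.
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        # Only entries that agree with the elected sign contribute.
        agree = [(w, x) for w, x in zip(weights, weighted) if x * sign > 0]
        update = sum(x for _, x in agree)
        if normalize and agree:
            update /= sum(w for w, _ in agree)
        merged.append(b + update)
    return merged


# Two toy "models" that disagree on the sign of the single parameter.
conflict = dare_ties_merge(
    base=[0.0],
    finetuned=[[1.0], [-1.0]],
    weights=[0.75, 0.25],
    densities=[1.0, 1.0],
)
```

With a density of 1.0 nothing is dropped, so the example isolates the sign-election step: the higher-weighted model decides the sign and the disagreeing delta is discarded.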

Models Merged

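The following models were included in the merge:

- VAGOsolutions/SauerkrautLM-v2-14b-DPO
- allknowingroger/QwenSlerp6-14B
- CultriX/SeQwence-14B-EvolMerge
- CultriX/Qwen2.5-14B-Wernicke
- allknowingroger/QwenStock3-14B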

Configuration

The following YAML configuration was used to produce this model:

### CONFIG SuperiorMerge-14B-From-2-to-10 ###

models:
  - model: VAGOsolutions/SauerkrautLM-v2-14b-DPO
    parameters:
      weight: 0.25    # Prioritize top IFEval
      density: 0.6     # Keep a large portion for strong factual baseline

  - model: allknowingroger/QwenSlerp6-14B
    parameters:
      weight: 0.25    # High weight for MATH and balanced reasoning
      density: 0.6     # Retain robust reasoning capabilities

  - model: CultriX/SeQwence-14B-EvolMerge
    parameters:
      weight: 0.20    # Important for best BBH and near-top MUSR
      density: 0.5     # Moderate density to ensure these strengths blend well

  - model: CultriX/Qwen2.5-14B-Wernicke
    parameters:
      weight: 0.15    # Adds top GPQA performance
      density: 0.5     # Sufficient to preserve QA strengths

  - model: allknowingroger/QwenStock3-14B
    parameters:
      weight: 0.15    # For top MMLU-PRO, enhancing domain knowledge
      density: 0.5     # Balanced integration of diverse subject expertise

base_model: Qwen/Qwen2.5-14B
merge_method: dare_ties
parameters:
  normalize: true      # Normalize the weights of the models contributing to each tensor
  int8_mask: true      # Memory and computational efficiency
dtype: bfloat16
tokenizer_source: Qwen/Qwen2.5-14B-Instruct

### END OF CONFIG SuperiorMerge-14B-From-2-to-10 ###
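
The per-model `weight` values in the configuration can be sanity-checked with a few lines of Python. This is a minimal sketch that hard-codes the weights from the YAML above instead of parsing the file, and it assumes that `normalize: true` rescales contributing weights so that they sum to 1.

```python
import math

# Weights copied from the configuration above.
weights = {
    "VAGOsolutions/SauerkrautLM-v2-14b-DPO": 0.25,
    "allknowingroger/QwenSlerp6-14B": 0.25,
    "CultriX/SeQwence-14B-EvolMerge": 0.20,
    "CultriX/Qwen2.5-14B-Wernicke": 0.15,
    "allknowingroger/QwenStock3-14B": 0.15,
}

total = sum(weights.values())

# Assumed effect of `normalize: true`: divide each weight by the total.
normalized = {name: w / total for name, w in weights.items()}

print(f"sum of weights: {total:.2f}")
print("already normalized:", math.isclose(total, 1.0))
```

Because the five weights already sum to 1, rescaling all of them together leaves them essentially unchanged. Under the assumption stated above, normalization would mainly matter for parameters where some models are masked out and only a subset contributes.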
