Qwen2.5-14B-DeepSeek-R1-1M

Qwen2.5-14B-DeepSeek-R1-1M is a merge of pre-trained language models created with MergeKit using the TIES merge method. It uses Qwen/Qwen2.5-14B-Instruct-1M as the base and combines deepseek-ai/DeepSeek-R1-Distill-Qwen-14B and Qwen/Qwen2.5-14B-Instruct with a weight of 1 and a density of 1. The merge configuration enables weight normalization (normalize: true) and int8 masking (int8_mask: true), and the merged weights are stored in bfloat16.
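
For readers who want to try the merged checkpoint, the following is a minimal sketch of loading it with the Hugging Face transformers library. It assumes the model is published under the repository ID prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M, that torch, transformers, and accelerate are installed, and that enough GPU memory is available for a 14.8B-parameter model in bfloat16. The helper names load_model and generate are hypothetical and not part of the original model card.

```python
# Assumption: repository ID as shown on the model page.
MODEL_ID = "prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M"


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and the model weights in bfloat16 (hypothetical helper)."""
    # Imports are kept inside the function so the module can be imported
    # on machines that do not have torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the dtype in the merge config
        device_map="auto",           # requires the accelerate package
    )
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Run plain text generation for a single prompt (hypothetical helper)."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate("Explain the TIES merge method in two sentences.") would download the weights on first use, so it is best run on a machine with sufficient disk space and memory.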

Merge

This is a merge of pre-trained language models created using mergekit.

Merge Method

This model was merged with the TIES merge method, using Qwen/Qwen2.5-14B-Instruct-1M as the base.
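
To make the method more concrete, here is a toy sketch of the TIES steps (task vectors, magnitude trimming, sign election, and merging of the agreeing entries) on small Python lists. It is an illustration under simplifying assumptions, not mergekit's implementation: the function names trim and ties_merge and the example values are hypothetical, and real merges operate on full model tensors.

```python
from typing import List


def trim(delta: List[float], density: float) -> List[float]:
    """Keep the top `density` fraction of entries by magnitude; zero the rest."""
    k = int(density * len(delta))
    if k >= len(delta):
        return list(delta)  # density 1 keeps every entry
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(
    base: List[float],
    models: List[List[float]],
    weights: List[float],
    density: float = 1.0,
    normalize: bool = True,
) -> List[float]:
    """Toy TIES merge of several fine-tuned parameter lists onto a base."""
    # 1. Task vectors: difference between each model and the base, then trim.
    deltas = [trim([m - b for m, b in zip(model, base)], density) for model in models]

    merged = []
    for j, b in enumerate(base):
        weighted = [w * d[j] for w, d in zip(weights, deltas)]
        # 2. Elect a sign per parameter from the sum of the weighted deltas.
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        # 3. Keep only the contributions that agree with the elected sign.
        agree = [(w, v) for w, v in zip(weights, weighted) if v * sign > 0]
        update = 0.0
        if agree:
            total = sum(v for _, v in agree)
            # With normalize, divide by the total weight of agreeing models.
            divisor = sum(w for w, _ in agree) if normalize else 1.0
            update = total / divisor
        merged.append(b + update)
    return merged


base = [0.0, 1.0, -1.0]
model_a = [0.5, 1.5, -2.0]
model_b = [-0.25, 2.0, -1.0]
print(ties_merge(base, [model_a, model_b], weights=[1.0, 1.0]))
# [0.5, 1.75, -2.0]
```

In the first position the two models disagree in sign, so only the larger, positive change survives. In the second position both agree and their changes are averaged. With density set to 1, as in this model's configuration, the trimming step keeps every entry.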

Models Merged

The following models were included in the merge:

- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- Qwen/Qwen2.5-14B-Instruct

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      weight: 1
      density: 1
merge_method: ties
base_model: Qwen/Qwen2.5-14B-Instruct-1M
parameters:
  weight: 1
  density: 1
  normalize: true
  int8_mask: true
dtype: bfloat16
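
As a rough sketch of how a configuration like this might be run, the snippet below stores the YAML in a string, writes it to disk on request, and builds the mergekit-yaml command line (config path first, output directory second). The file name config.yaml, the output directory ./merged-model, and the helper names are assumptions for illustration. Running the command requires mergekit to be installed and downloads all three source models.

```python
import shlex
from pathlib import Path
from typing import List

# The merge configuration from this model card, reproduced verbatim.
CONFIG_YAML = """\
models:
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      weight: 1
      density: 1
merge_method: ties
base_model: Qwen/Qwen2.5-14B-Instruct-1M
parameters:
  weight: 1
  density: 1
  normalize: true
  int8_mask: true
dtype: bfloat16
"""


def write_config(path: str) -> Path:
    """Write the configuration to `path` (hypothetical helper)."""
    target = Path(path)
    target.write_text(CONFIG_YAML, encoding="utf-8")
    return target


def build_merge_command(config_path: str, output_dir: str) -> List[str]:
    """Build the mergekit-yaml invocation: config file, then output directory."""
    return ["mergekit-yaml", config_path, output_dir]


# Hypothetical file and directory names.
cmd = build_merge_command("config.yaml", "./merged-model")
print(shlex.join(cmd))
# mergekit-yaml config.yaml ./merged-model
```

To actually run the merge, call write_config("config.yaml") and then pass cmd to subprocess.run(cmd, check=True).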

Model size: 14.8B params (Safetensors). Tensor type: BF16.