---
# mergekit configuration: della_linear merge of eight Qwen2.5-14B variants
# onto the CultriX/Qwen2.5-14B-Wernickev3 base.
merge_method: della_linear  # NOTE(review): re-validate della_linear vs. alternative methods against latest benchmarks
base_model: CultriX/Qwen2.5-14B-Wernickev3
dtype: bfloat16
parameters:
  epsilon: 0.01  # fine-grain precision; validate against current benchmark results
  lambda: 1.5  # scales merged contributions; confirm it prioritizes the intended model strengths
  normalize: true  # per-model weights below sum to 1.35; normalization rescales them
  smoothing_factor: 0.08  # balanced blending to preserve model diversity across tasks
# NOTE(review): gradient_clipping / adaptive_merge_parameters are not standard
# mergekit keys — confirm the consuming tool actually reads them.
gradient_clipping:
  CultriX/Qwen2.5-14B-Wernickev3: 0.85
  CultriX/Qwenfinity-2.5-14B: 0.82
  djuna/Q2.5-Veltha-14B-0.5: 0.92
  CultriX/Qwen2.5-14B-Broca: 0.86
  qingy2019/Qwen2.5-Math-14B-Instruct: 0.94
  CultriX/SeQwence-14Bv1: 0.87
  sometimesanotion/Qwen2.5-14B-Vimarckoso: 0.90
  allknowingroger/QwenSlerp6-14B: 0.86
models:
  - model: CultriX/Qwen2.5-14B-Wernickev3
    parameters:
      weight: 0.25
      density: 0.72
  - model: CultriX/Qwenfinity-2.5-14B
    parameters:
      weight: 0.22
      density: 0.68
  - model: djuna/Q2.5-Veltha-14B-0.5
    parameters:
      weight: 0.20
      density: 0.75
  - model: CultriX/Qwen2.5-14B-Broca
    parameters:
      weight: 0.16
      density: 0.68
  - model: qingy2019/Qwen2.5-Math-14B-Instruct
    parameters:
      weight: 0.19
      density: 0.75
  - model: CultriX/SeQwence-14Bv1
    parameters:
      weight: 0.13
      density: 0.65
  - model: sometimesanotion/Qwen2.5-14B-Vimarckoso
    parameters:
      weight: 0.11
      density: 0.62
  - model: allknowingroger/QwenSlerp6-14B
    parameters:
      weight: 0.09
      density: 0.65
adaptive_merge_parameters:
  task_weights:
    tinyArc: 1.65
    tinyHellaswag: 1.55
    tinyMMLU: 1.7
    tinyTruthfulQA: 1.95
    tinyTruthfulQA_mc1: 1.75
    tinyWinogrande: 1.8
    IFEval: 2.0
    BBH: 1.75
    MATH: 2.2
    GPQA: 1.85
    MUSR: 1.95
    MMLU-PRO: 1.85  # NOTE(review): confirm task_weights align with latest performance data
tokenizer_source: CultriX/Qwen2.5-14B-Wernickev3