---
base_model:
- allura-org/Qwen2.5-32b-RP-Ink
- deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
- Aryanne/QwentileSwap
- Daemontatox/Cogito-Ultima
library_name: transformers
tags:
- mergekit
- merge

---
# Qwetiapin

> There's no 'I' in 'brain damage'

![](https://files.catbox.moe/9k5p1v.png)

### Overview

An attempt to make QwentileSwap write even better by merging it with RP-Ink. And DeepSeek, because why not. However, I screwed up the first merge step by accidentally setting an extremely high epsilon value. Step2 wasn't planned either: a tensor size mismatch error kept me from merging Step1 into QwentileSwap with SCE, so I just threw another model into the mix, and that did, in fact, solve the issue.

The result? Well, it's usable, I guess. The slop is reduced and more details come through, but those details sometimes get messed up. A few swipes fix it, and it might just be my sampler settings, but I'll leave them as they are.

Prompt format: ChatML
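
For reference, ChatML wraps each turn in `<|im_start|>` / `<|im_end|>` tokens, like so:

```
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>assistant
{response}<|im_end|>
```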

Settings: [This kinda works but I'm weird](https://files.catbox.moe/hmw87j.json)

### Quants

[Static](https://huggingface.co/mradermacher/Q2.5-Qwetiapin-32B-GGUF) | [Imatrix](https://huggingface.co/mradermacher/Q2.5-Qwetiapin-32B-i1-GGUF)
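
If you're running the GGUFs, here's a minimal sketch with llama-cpp-python; the filename is just a guess at how the static quants are usually named, so point it at whichever file you actually download:

```python
from llama_cpp import Llama

# model_path is a placeholder; use the quant file you grabbed from the repos above.
llm = Llama(
    model_path="Q2.5-Qwetiapin-32B.Q4_K_M.gguf",
    n_ctx=8192,        # context window
    n_gpu_layers=-1,   # offload all layers if you have the VRAM
)

# Recent llama-cpp-python should pick up the embedded ChatML chat template.
out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are {{char}}."},
        {"role": "user", "content": "Hi."},
    ],
)
print(out["choices"][0]["message"]["content"])
```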

## Merge Details
### Merging Steps
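
Each step below is a plain mergekit config, with the later steps presumably referencing the earlier outputs by their local directory names (Step1, Step2). As a rough sketch of how a step gets run, assuming the config is saved as `step1.yaml` (the `mergekit-yaml step1.yaml ./Step1` CLI does the same thing):

```python
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the step's YAML config (the file name is just an example).
with open("step1.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Write the merged model to ./Step1 so later configs can reference it by path.
run_merge(
    merge_config,
    out_path="./Step1",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU if one is available
        copy_tokenizer=True,             # keep the tokenizer from tokenizer_source
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```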

### Step1

```yaml
dtype: bfloat16
tokenizer_source: base
merge_method: della_linear
parameters:
  density: 0.5
  epsilon: 0.4 # was supposed to be 0.04
  lambda: 1.1
base_model: allura-org/Qwen2.5-32b-RP-Ink
models:
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
    parameters:
      weight:
        - filter: v_proj
          value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
        - filter: o_proj
          value: [1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1]
        - filter: up_proj
          value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
        - filter: gate_proj
          value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
        - filter: down_proj
          value: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        - value: 0
  - model: allura-org/Qwen2.5-32b-RP-Ink
    parameters:
      weight:
        - filter: v_proj
          value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
        - filter: o_proj
          value: [0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0]
        - filter: up_proj
          value: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        - filter: gate_proj
          value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
        - filter: down_proj
          value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
        - value: 1
```

### Step2

```yaml
models:
  - model: Aryanne/QwentileSwap
    parameters:
      weight: [1.0, 0.9, 0.8, 0.9, 1.0]
  - model: Daemontatox/Cogito-Ultima
    parameters:
      weight: [0, 0.1, 0.2, 0.1, 0]
merge_method: nuslerp
parameters:
  nuslerp_row_wise: true
dtype: bfloat16
tokenizer_source: base
```

### Step3

```yaml
models:
  - model: Step2
  - model: Step1
merge_method: sce
base_model: Step2
parameters:
  select_topk:
    - value: [0.3, 0.35, 0.4, 0.35, 0.2]
dtype: bfloat16
```
``` |