Update README.md
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
---

BitsAndBytes 4-bit quantization of DeepSeek-R1-Distill-Qwen-7B, commit 393119fcd6a873e5776c79b0db01c96911f5f0fc.

Tested successfully with vLLM 0.7.2 with the following parameters:

```python
import torch
from vllm import LLM

llm_model = LLM(
    "MPWARE/DeepSeek-R1-Distill-Qwen-7B-BnB-4bits",
    task="generate",
    dtype=torch.bfloat16,
    max_num_seqs=8192,
    max_model_len=8192,
    trust_remote_code=True,
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    enforce_eager=True,  # Required for vLLM architecture V1
    tensor_parallel_size=1,
    gpu_memory_utilization=0.95,
    seed=42
)
```
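Once loaded, the model can be queried through vLLM's standard offline `generate` API. A minimal sketch, assuming the `LLM` instance above has been created on a GPU host; the prompt and sampling values here are illustrative, not part of this model card:

```python
from vllm import SamplingParams

# Illustrative sampling settings; tune temperature/top_p for your use case
sampling = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=1024)

# llm_model is the LLM instance constructed above
outputs = llm_model.generate(
    ["Explain 4-bit quantization in one paragraph."],
    sampling,
)
print(outputs[0].outputs[0].text)
```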