MPWARE committed on
Commit 5c2eb83 · verified · 1 Parent(s): 994c9df

Update README.md

  1. README.md +19 -0
README.md CHANGED
@@ -6,3 +6,22 @@ base_model:
  - deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
  ---
  BitsAndBytes 4 bits quantization from DeepSeek-R1-Distill-Qwen-7B commit 393119fcd6a873e5776c79b0db01c96911f5f0fc
+
+ Tested successfully with vLLM 0.7.2 with the following parameters:
+
+ ```python
+ llm_model = LLM(
+     "MPWARE/DeepSeek-R1-Distill-Qwen-7B-BnB-4bits",
+     task="generate",
+     dtype=torch.bfloat16,
+     max_num_seqs=8192,
+     max_model_len=8192,
+     trust_remote_code=True,
+     quantization="bitsandbytes",
+     load_format="bitsandbytes",
+     enforce_eager=True, # Required for vLLM architecture V1
+     tensor_parallel_size=1,
+     gpu_memory_utilization=0.95,
+     seed=42
+ )
+ ```
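
The snippet added in this commit omits its imports and stops after constructing the engine. A minimal sketch of how it might be completed follows; it is not part of the commit. The helper names `build_engine_kwargs` and `run_demo` are hypothetical, the sampling values are illustrative assumptions rather than tested settings, and `dtype` is passed as the string `"bfloat16"` in place of `torch.bfloat16` so that the argument dict can be built without importing torch.

```python
from typing import Any


def build_engine_kwargs(max_model_len: int = 8192) -> dict[str, Any]:
    """Collect the engine arguments listed in the README into one dict."""
    return {
        "model": "MPWARE/DeepSeek-R1-Distill-Qwen-7B-BnB-4bits",
        "task": "generate",
        # Assumption: string form used instead of torch.bfloat16.
        "dtype": "bfloat16",
        "max_num_seqs": 8192,
        "max_model_len": max_model_len,
        "trust_remote_code": True,
        "quantization": "bitsandbytes",
        "load_format": "bitsandbytes",
        "enforce_eager": True,
        "tensor_parallel_size": 1,
        "gpu_memory_utilization": 0.95,
        "seed": 42,
    }


def run_demo(prompt: str, max_tokens: int = 256) -> str:
    """Hypothetical helper: load the model and return one completion."""
    # Imported lazily so this module can be inspected without vLLM or a GPU.
    from vllm import LLM, SamplingParams

    llm_model = LLM(**build_engine_kwargs())
    # Illustrative sampling settings (assumptions, not from the README).
    sampling = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=max_tokens)
    outputs = llm_model.generate([prompt], sampling)
    # One prompt in, one RequestOutput out; take its first completion.
    return outputs[0].outputs[0].text
```

On a machine with a GPU, vLLM, and bitsandbytes installed, `print(run_demo("What is 2 + 2?"))` would load the quantized model and print a single completion. Keeping the arguments in a dict makes it easy to lower `max_model_len` or `gpu_memory_utilization` if the model does not fit in memory.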