---
license: mit
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
---

BitsAndBytes 4-bit quantization of DeepSeek-R1-Distill-Qwen-7B, commit 393119fcd6a873e5776c79b0db01c96911f5f0fc.
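For reference, the sketch below shows one plausible way to produce such a checkpoint with Hugging Face transformers and a `BitsAndBytesConfig` (NF4 with bfloat16 compute); the exact export settings used for this repository may differ.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

BASE = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
REVISION = "393119fcd6a873e5776c79b0db01c96911f5f0fc"

# Assumed 4-bit settings (NF4, bfloat16 compute, double quantization);
# the checkpoint in this repository may have been exported with different options.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    BASE,
    revision=REVISION,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(BASE, revision=REVISION)

# Save the quantized weights locally (they can then be pushed to the Hub).
model.save_pretrained("DeepSeek-R1-Distill-Qwen-7B-BnB-4bits")
tokenizer.save_pretrained("DeepSeek-R1-Distill-Qwen-7B-BnB-4bits")
```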
Tested successfully with vLLM 0.7.2 using the following parameters:
```python
import torch
from vllm import LLM

llm_model = LLM(
    "MPWARE/DeepSeek-R1-Distill-Qwen-7B-BnB-4bits",
    task="generate",
    dtype=torch.bfloat16,
    max_num_seqs=8192,
    max_model_len=8192,
    trust_remote_code=True,
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    enforce_eager=True,  # Required for vLLM V1 architecture
    tensor_parallel_size=1,
    gpu_memory_utilization=0.95,
    seed=42
)
```
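Once loaded, the model can be queried through vLLM's standard generate API. The prompt and sampling settings below are illustrative only, not values shipped with this repository:

```python
from vllm import SamplingParams

# Illustrative sampling settings; adjust to your use case.
sampling_params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=2048)

prompt = "How many positive integers less than 100 are divisible by 3 or 5?"
outputs = llm_model.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```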