---
license: mit
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
---
BitsAndBytes 4-bit quantization of DeepSeek-R1-Distill-Qwen-7B (commit 393119fcd6a873e5776c79b0db01c96911f5f0fc).

Tested successfully with vLLM 0.7.2 using the following parameters:
```python
import torch
from vllm import LLM

llm_model = LLM(
    "MPWARE/DeepSeek-R1-Distill-Qwen-7B-BnB-4bits",
    task="generate",
    dtype=torch.bfloat16,
    max_num_seqs=8192,
    max_model_len=8192,
    trust_remote_code=True,
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    enforce_eager=True,  # Required for vLLM architecture V1
    tensor_parallel_size=1,
    gpu_memory_utilization=0.95,
    seed=42
)
```
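Once loaded, generation follows the standard vLLM API. A minimal sketch (the prompt and sampling values below are illustrative assumptions, not part of the tested setup):

```python
from vllm import SamplingParams

# Hypothetical prompt; R1-distilled models emit their reasoning inside
# <think>...</think> tags before the final answer, so budget tokens for it.
prompts = ["Solve step by step: what is 17 * 24?"]

sampling_params = SamplingParams(
    temperature=0.6,   # illustrative value for reasoning-style sampling
    top_p=0.95,
    max_tokens=4096,   # leave room for the <think> block
)

outputs = llm_model.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```

Note that `max_tokens` must stay within the `max_model_len=8192` configured above, including the prompt tokens.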