Cannot Run `unsloth/DeepSeek-R1-GGUF` Model – Missing `configuration_deepseek.py`
I am trying to load the `unsloth/DeepSeek-R1-GGUF` model using `AutoModelForCausalLM` from the `transformers` library, but I keep running into this error:
OSError: unsloth/DeepSeek-R1-GGUF does not appear to have a file named configuration_deepseek.py.
I have already checked the Files and Versions section in the model repo, and there is a `config.json`, but using it still results in the same error.
Here’s my code:
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained(hf_model, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(hf_model, trust_remote_code=True, token=HF_TOKEN, config=config)
The error suggests that `transformers` is looking for a `configuration_deepseek.py` file that does not exist.
I am aware that GGUF models are typically designed for `llama.cpp`, but I would like to know:
- Is there a way to load this model in `transformers` despite the missing `configuration_deepseek.py`?
- If not, is there an equivalent HF model (non-GGUF) that I should be using instead?
Would appreciate any insights. Thanks in advance.
I found a repo some Chinese dude hacked together that includes the additional files needed to run the small unsloth quants with ktransformers.
Might be able to make an HF repo just to hold the extra files, and serve the big GGUFs locally?
Not sure how to tell ktransformers/transformers to look for the files in a local dir...
Yeah, got it working. Just download all the `.py` and `.json` files out of that sketchy repo and stick them into the directory with your good unsloth GGUFs, e.g.:
$ ls /mnt/raid/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-UD-Q2_K_XL/
config.json DeepSeek-R1-UD-Q2_K_XL-00002-of-00005.gguf DeepSeek-R1-UD-Q2_K_XL-00005-of-00005.gguf tokenizer.json
configuration_deepseek.py DeepSeek-R1-UD-Q2_K_XL-00003-of-00005.gguf generation_config.json
DeepSeek-R1-UD-Q2_K_XL-00001-of-00005.gguf DeepSeek-R1-UD-Q2_K_XL-00004-of-00005.gguf tokenizer_config.json
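If you'd rather script that download, something along these lines should work with `huggingface-cli` (the repo id is a placeholder since I'm not naming the repo here; point it at whichever repo actually carries the `configuration_deepseek.py` / tokenizer files):
# NOTE: SOME-USER/DeepSeek-R1-config-files is a placeholder repo id -- substitute the real one
$ huggingface-cli download SOME-USER/DeepSeek-R1-config-files --include "*.py" "*.json" \
    --local-dir /mnt/raid/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-UD-Q2_K_XL/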
Then I loaded ktransformers like so and got faster generation than llama.cpp, possibly with FA (FlashAttention) enabled too?? (Not sure of context lengths or anything, just got it going.) It might fix your transformers issue too if you translate the model path stuff right:
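Roughly this -- a sketch based on the local_chat usage in the ktransformers README, so adjust paths and any extra args to your setup:
# --model_path: dir with config.json / configuration_deepseek.py / tokenizer files
# --gguf_path:  dir with the .gguf shards (same dir here, since everything lives together)
$ python -m ktransformers.local_chat \
    --model_path /mnt/raid/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-UD-Q2_K_XL/ \
    --gguf_path /mnt/raid/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-UD-Q2_K_XL/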
I have a quick guide I'm working on in an issue over there: https://github.com/kvcache-ai/ktransformers/issues/186#issuecomment-2659894815