Cannot Run `unsloth/DeepSeek-R1-GGUF` Model – Missing `configuration_deepseek.py`

#32 opened by syrys4750

I am trying to load the unsloth/DeepSeek-R1-GGUF model using AutoModelForCausalLM from the transformers library, but I keep running into this error:

OSError: unsloth/DeepSeek-R1-GGUF does not appear to have a file named configuration_deepseek.py.

I have already checked the Files and Versions section in the model repo, and there is a config.json, but using it still results in the same error.

Here’s my code:

from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained(hf_model, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(hf_model, trust_remote_code=True, token=HF_TOKEN, config=config)

The error suggests that transformers is looking for a configuration_deepseek.py file that does not exist.

I am aware that GGUF models are typically designed for llama.cpp, but I would like to know:

  1. Is there a way to load this model in transformers despite the missing configuration_deepseek.py?
  2. If not, is there an equivalent HF model (non-GGUF) that I should be using instead?

Would appreciate any insights. Thanks in advance.

I found a repo someone hacked together that includes the additional files needed to run the smaller unsloth quants with ktransformers:

https://huggingface.co/is210379/DeepSeek-R1-UD-IQ1_S/discussions/1#67af73c8fc64848c6031148d

Might be able to make an HF repo just to hold the extra files, and serve the big GGUFs locally?

Not sure how to tell ktransformers (or transformers) to look for the files in a local dir, though.

Yeah, got it working: just download all the .py and .json files out of that sketchy repo and stick them into the directory with your good unsloth GGUFs (a scripted version of the download is sketched after the listing), e.g.:

$ ls /mnt/raid/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-UD-Q2_K_XL/
config.json                                 DeepSeek-R1-UD-Q2_K_XL-00002-of-00005.gguf  DeepSeek-R1-UD-Q2_K_XL-00005-of-00005.gguf  tokenizer.json
configuration_deepseek.py                   DeepSeek-R1-UD-Q2_K_XL-00003-of-00005.gguf  generation_config.json
DeepSeek-R1-UD-Q2_K_XL-00001-of-00005.gguf  DeepSeek-R1-UD-Q2_K_XL-00004-of-00005.gguf  tokenizer_config.json
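
For reference, a minimal sketch of scripting that download with huggingface_hub; this is my own sketch, not from the original posts, and it assumes the repo id from the discussion linked above and that the *.py / *.json patterns cover everything needed:

# Sketch: pull only the sidecar config/tokenizer files (.py / .json) from the
# community repo into the directory that already holds the unsloth GGUF shards.
# The repo id and local path are assumptions based on the links and listing above.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="is210379/DeepSeek-R1-UD-IQ1_S",
    allow_patterns=["*.py", "*.json"],  # skip downloading any GGUF weights from that repo
    local_dir="/mnt/raid/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-UD-Q2_K_XL",
)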

Then I loaded ktransformers against that directory and got faster generation than llama.cpp, possibly with FA (flash attention) enabled too?? (not sure about context lengths or anything, just got it going). It might fix your transformers issue too if you get the model path stuff right; there's a rough sketch of that after the guide link below.

I have a quick guide I'm working on in an issue over there: https://github.com/kvcache-ai/ktransformers/issues/186#issuecomment-2659894815
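
On the transformers side, the idea is just pointing from_pretrained at that local directory so it picks up configuration_deepseek.py from disk instead of looking for it in the GGUF repo. A rough sketch, assuming the directory layout shown above; note that actually loading the GGUF weight shards through transformers may still not work for this architecture, so this mainly clears the missing-file error for the config and tokenizer:

from transformers import AutoConfig, AutoTokenizer

# Assumed local path matching the listing above; adjust to your setup.
local_dir = "/mnt/raid/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-UD-Q2_K_XL"

# trust_remote_code lets transformers run the configuration_deepseek.py that now sits in the local dir.
config = AutoConfig.from_pretrained(local_dir, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(local_dir, trust_remote_code=True)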
