Issue with --n-gpu-layers 5 Parameter: Model Only Running on CPU
#10
by vuk123
Hi, I’m facing an issue where the --n-gpu-layers 5 parameter doesn’t seem to work. Despite having 2x NVIDIA A6000 GPUs, the model runs entirely on the CPU, with no GPU utilization. Has anyone else encountered this, or is there a fix for it?
This is how I run the model:
llama-cli --model /home/user/mymodels/DeepSeek-V3-Q3_K_M/DeepSeek-V3-Q3_K_M-00001-of-00007.gguf --cache-type-k q5_0 --threads 16 --prompt '<|User|>What is 1+1?<|Assistant|>' --n-gpu-layers 5
It looks like the problem is that I installed llama.cpp with brew, so it wasn't compiled with CUDA support...
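A quick way to confirm that (assuming the brew-installed binary is the one on your PATH as llama-cli) is to check whether it links against any CUDA libraries at all:

ldd $(which llama-cli) | grep -i cuda

If that prints nothing, the build has no CUDA backend and --n-gpu-layers is effectively ignored, so all layers stay on the CPU.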
I built it with CMake, and now it works...
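For reference, a build along these lines should enable the CUDA backend (a sketch assuming a recent llama.cpp checkout, where the relevant CMake option is GGML_CUDA):

# clone llama.cpp and build it with the CUDA backend enabled
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

The resulting binaries, including llama-cli, end up under build/bin; running the model from there should show the GPUs being used when --n-gpu-layers is set.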
Glad you got it working!