Issue with --n-gpu-layers 5 Parameter: Model Only Running on CPU

#10
by vuk123 - opened

Hi, I’m facing an issue where the --n-gpu-layers 5 parameter doesn’t seem to work. Despite having 2x NVIDIA A6000 GPUs, the model runs entirely on the CPU, with no GPU utilization. Has anyone else encountered this, or is there a fix for it?

This is how I run the model:

llama-cli --model /home/user/mymodels/DeepSeek-V3-Q3_K_M/DeepSeek-V3-Q3_K_M-00001-of-00007.gguf --cache-type-k q5_0 --threads 16 --prompt '<|User|>What is 1+1?<|Assistant|>' --n-gpu-layers 5

It looks like the problem is that I installed llama.cpp with brew, so it wasn't compiled with CUDA support...
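
For anyone hitting the same thing, a quick way to check whether the binary on your PATH is the Homebrew one rather than a CUDA-enabled build (the formula name below is an assumption; adjust it to your install):

```bash
# Show which llama-cli is actually being run and whether it was installed via Homebrew
which llama-cli
brew list llama.cpp   # assumes the Homebrew formula is named "llama.cpp"
```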

I rebuilt it with CMake, and now it works...
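
For reference, a minimal sketch of a source build with the CUDA backend enabled, assuming a recent llama.cpp checkout (older releases used -DLLAMA_CUBLAS=ON instead of -DGGML_CUDA=ON):

```bash
# Build llama.cpp from source with CUDA so --n-gpu-layers can offload to the GPUs
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
```

Running the same llama-cli command with the freshly built binary (e.g. build/bin/llama-cli) and, if VRAM allows, a larger --n-gpu-layers value should then show GPU utilization.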

Glad you got it working!