How to load on a single A100 40GB

#18
by mnwato - opened

Hi. Does anyone know the memory usage? Is there a way to load it on a single A100 40GB?

Please use the development version of Hugging Face transformers (4.30-dev, installed from GitHub with pip) and accelerate 0.20-dev (also installed from GitHub).

Then use the bitsandbytes package to load the model in 4-bit: set load_in_4bit, use bfloat16 as the compute dtype, and set quant_type=nf4.
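
For anyone landing here later, a minimal sketch of what that setup might look like in Python. The thread does not name the checkpoint or its size, so `model_id` and the 40e9 parameter count below are placeholders, and the helper names `estimate_weight_gib` and `load_in_4bit_nf4` are made up for illustration. The estimate covers the weights only; quantization constants, activations, and the KV cache add to it, so treat the number as a lower bound.

```python
def estimate_weight_gib(n_params: float, bits_per_param: float) -> float:
    """Rough size of the weights alone, in GiB.

    Ignores quantization constants, activations, and the KV cache.
    """
    return n_params * bits_per_param / 8 / 2**30


def load_in_4bit_nf4(model_id: str):
    """Load a causal LM with 4-bit NF4 weights and bfloat16 compute.

    Assumes transformers, accelerate, and bitsandbytes are installed
    and that a CUDA GPU is available.
    """
    # Imported lazily so the size estimate above runs without a GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # store weights in 4-bit
        bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
        bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bfloat16
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place the layers on the GPU
    )
    return tokenizer, model


if __name__ == "__main__":
    n_params = 40e9  # hypothetical parameter count, not taken from this thread
    for bits in (16, 4):
        print(f"{bits}-bit weights: ~{estimate_weight_gib(n_params, bits):.1f} GiB")
```

To try it, call `load_in_4bit_nf4("<your model id>")` with the repository you want to load. If the 4-bit estimate for your model is already close to 40 GiB, it probably will not fit once runtime overhead is included.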

mnwato changed discussion status to closed
