How much VRAM?

#14
by tmanzz - opened

How much VRAM is this model meant to use? I am seeing close to 10 GB!
Any optimization recommendations?

I looked at https://huggingface.co/blog/sd3#using-sd3-with-diffusers but I can't make it work. My GPU is a T4 (16 GB).

This is what my pipeline looks like:

quantization_config = BitsAndBytesConfig(load_in_8bit=True)
text_encoder = T5EncoderModel.from_pretrained(
    model_name,
    subfolder="text_encoder_3",
    quantization_config=quantization_config,
)
self.pipeline = StableDiffusion3Pipeline.from_pretrained(
    model_name,
    cache_dir="./cache",
    text_encoder_3=text_encoder,
    device_map="balanced",
    torch_dtype=torch.float16,
)
self.pipeline.enable_model_cpu_offload()

Any suggestions?

The error I get is, for example:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 252.00 MiB. GPU

This works fine on my old GTX 1080 with 8 GB of VRAM:

pipeline = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3-medium-diffusers", text_encoder_3=None, tokenizer_3=None,  torch_dtype=torch.float16).to('cuda')

@KernelDebugger - that works! Genius! I wonder what the text_encoder and tokenizer do if they can be set to 'None'?

@KernelDebugger "Dropping the T5 Text Encoder" alone didn't work for me. I combined it with "Model Offloading", and now it works on an 8 GB GTX 1070 Ti.

pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3-medium-diffusers", text_encoder_3=None, tokenizer_3=None, torch_dtype=torch.float16)
pipe.enable_model_cpu_offload()
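
To check how much VRAM a given setup actually uses, one option is to read PyTorch's peak-allocation counter around a generation call. This is only a minimal sketch, assuming a CUDA device and an already-built `pipe`; `bytes_to_gib` and `report_peak_vram` are hypothetical helper names, not part of diffusers.

```python
def bytes_to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 1024**3


def report_peak_vram(run):
    """Call run() and return (result, peak GiB allocated by PyTorch on the GPU)."""
    import torch  # imported here so bytes_to_gib stays usable without torch

    if not torch.cuda.is_available():
        raise RuntimeError("No CUDA device available")

    torch.cuda.reset_peak_memory_stats()  # start counting from the current state
    result = run()
    torch.cuda.synchronize()              # make sure all GPU work has finished
    peak = torch.cuda.max_memory_allocated()
    return result, bytes_to_gib(peak)


# Hypothetical usage with the pipeline from the post above:
# image, peak = report_peak_vram(lambda: pipe("a photo of a cat").images[0])
# print(f"peak VRAM allocated by PyTorch: {peak:.2f} GiB")
```

Note that this counter only covers tensors allocated through PyTorch, so tools such as nvidia-smi will usually show a higher number.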

@KernelDebugger - that works! Genius! I wonder what the text_encoder and tokenizer do if they can be set to 'None'?

They are optional. The model also includes more lightweight CLIP text encoders, and the pipeline falls back to using them when text_encoder_3=None.
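
A rough back-of-the-envelope calculation shows why dropping the T5 encoder saves so much memory. The parameter counts below are approximate assumptions (not read from the checkpoint), and the estimate covers weights only: it ignores the VAE, activations, and CUDA overhead, so real usage will be higher.

```python
# Bytes needed to store one parameter for each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

# Approximate parameter counts (assumptions for illustration only).
APPROX_PARAMS = {
    "transformer": 2.0e9,               # SD3 Medium MMDiT, about 2B
    "text_encoder (CLIP-L)": 0.12e9,    # about 0.1B
    "text_encoder_2 (CLIP-G)": 0.7e9,   # about 0.7B
    "text_encoder_3 (T5-XXL)": 4.7e9,   # about 4.7B
}


def weights_gib(num_params, dtype="float16"):
    """Memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


def total_weights_gib(components, dtype="float16"):
    """Sum the weight memory of the named components."""
    return sum(weights_gib(APPROX_PARAMS[name], dtype) for name in components)


all_components = list(APPROX_PARAMS)
no_t5 = [name for name in all_components if "T5" not in name]

with_t5 = total_weights_gib(all_components)
without_t5 = total_weights_gib(no_t5)

print(f"fp16 weights with T5:    {with_t5:.1f} GiB")
print(f"fp16 weights without T5: {without_t5:.1f} GiB")
```

Under these assumptions the T5 encoder accounts for well over half of the fp16 weights, which is consistent with the reports above that text_encoder_3=None fits on 8 GB cards.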
