Quantization settings

  • vae.: torch.bfloat16. No quantization.
  • text_encoder.layers.:
    • Int8 with Optimum Quanto
    • Target layers: ["q_proj", "k_proj", "v_proj", "o_proj", "mlp.down_proj", "mlp.gate_up_proj"]
  • diffusion_model.:
    • Int8 with Optimum Quanto
    • Target layers: ["to_q", "to_k", "to_v", "to_out.0", "ff.net.0.proj", "ff.net.2"]
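
A minimal sketch of how these settings could be applied with Optimum Quanto's quantize/freeze API. The pipeline class and attribute names (CogView4Pipeline, text_encoder, transformer) follow diffusers' CogView4 integration; the include patterns are assumptions derived from the target-layer lists above.

```python
import torch
from diffusers import CogView4Pipeline
from optimum.quanto import freeze, qint8, quantize

# Load the base pipeline in bfloat16; the VAE stays unquantized.
pipe = CogView4Pipeline.from_pretrained(
    "THUDM/CogView4-6B", torch_dtype=torch.bfloat16
)

# Int8-quantize only the listed projections of the GLM text encoder.
quantize(
    pipe.text_encoder,
    weights=qint8,
    include=[
        "*q_proj", "*k_proj", "*v_proj", "*o_proj",
        "*mlp.down_proj", "*mlp.gate_up_proj",
    ],
)
freeze(pipe.text_encoder)

# Int8-quantize the attention and feed-forward projections of the denoiser.
quantize(
    pipe.transformer,
    weights=qint8,
    include=[
        "*to_q", "*to_k", "*to_v", "*to_out.0",
        "*ff.net.0.proj", "*ff.net.2",
    ],
)
freeze(pipe.transformer)
```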

VRAM consumption

  • Text encoder (text_encoder.): about 11 GB
  • Denoiser (diffusion_model.): about 10 GB
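
Figures like these can be checked with PyTorch's CUDA allocator statistics; a minimal sketch (which component you run between the reset and the read is up to you):

```python
import torch

def peak_vram_gb(device: int = 0) -> float:
    """Peak VRAM allocated on the given CUDA device, in GB."""
    return torch.cuda.max_memory_allocated(device) / 1024**3

torch.cuda.reset_peak_memory_stats()
# ... run the component to measure here, e.g. the text encoder ...
print(f"peak VRAM: {peak_vram_gb():.1f} GB")
```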

Samples

Side-by-side sample images for each precision setting:

  • torch.bfloat16: VRAM 40 GB (without offloading)
  • Quanto Int8: VRAM 28 GB (without offloading)
Generation parameters
  • prompt: """A photo of a nendoroid figure of hatsune miku holding a sign that says "CogView4" """
  • negative_prompt: "blurry, low quality, horror"
  • height: 1152
  • width: 1152
  • cfg_scale: 3.5
  • num_inference_steps: 20
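
A sketch of generating with these parameters via diffusers; cfg_scale corresponds to the pipeline's guidance_scale argument. The base checkpoint is loaded here for a self-contained example; substitute the quantized pipeline from the sketch above to reproduce the Int8 column.

```python
import torch
from diffusers import CogView4Pipeline

pipe = CogView4Pipeline.from_pretrained(
    "THUDM/CogView4-6B", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    prompt='A photo of a nendoroid figure of hatsune miku '
           'holding a sign that says "CogView4"',
    negative_prompt="blurry, low quality, horror",
    height=1152,
    width=1152,
    guidance_scale=3.5,  # listed as cfg_scale above
    num_inference_steps=20,
).images[0]
image.save("cogview4_sample.png")
```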

Model tree for p1atdev/CogView4-6B-quanto_int8

  • Base model: THUDM/glm-4-9b
  • Finetuned from: THUDM/CogView4-6B
  • Quantized: this model