Please convert these models to GGUF format...

#12
by Moodym - opened

Please convert to GGUF format...

Others already have. Just do a search.

@bartowski has already made quants for this.
It may be helpful for the model card to include links to those.
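For anyone who wants to produce their own quants instead, the usual route is llama.cpp's conversion script followed by its quantize tool. A minimal sketch, assuming llama.cpp is cloned and built locally; the model path and the Q4_K_M quant type are just illustrative examples:

```shell
# Assumes llama.cpp is cloned/built and the HF checkpoint is downloaded locally.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt

# 1) Convert the Hugging Face checkpoint to a full-precision GGUF file.
python convert_hf_to_gguf.py /path/to/model --outtype f16 --outfile model-f16.gguf

# 2) Quantize it down; Q4_K_M is a common size/quality trade-off.
./llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```

The quant type in step 2 is where the precision/size trade-off discussed below is decided.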

Is the quantized version still that competitive against o1-mini?

There is obviously some degradation from the reduced precision, but there are no performance metrics showing exactly how much is lost at each quant level.
IMHO a non-quantized 32B can perform as well as a quantized full R1. In any case, the 32B model is extraordinarily good.


To me it's a lot like a WAV-to-MP3 decision when you don't have much storage on your music device. Sure, you lose some quality, but the file is smaller. It's a trade-off of necessity; you just have to decide how low a bit rate you're willing to go.
