The paper suggests 4-bit quantization. Will Microsoft release a quantized version?
There are quantized models here:
https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx
https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf