
4-bit quantized GGUF weights of Phi-3-mini-4k-instruct, compatible with MLX.
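Since MLX can read GGUF files directly via `mx.load`, the weights can be loaded and inspected as in the minimal sketch below. The local file name is an assumption and may differ from the actual file in this repo.

```python
# Minimal sketch: inspecting the GGUF weights with MLX.
# The file name is hypothetical; point it at the downloaded .gguf file.
import mlx.core as mx

# mx.load supports the .gguf format; return_metadata=True also returns
# the GGUF metadata (architecture, tokenizer config, etc.).
weights, metadata = mx.load("phi-3-mini-4k-instruct-q4.gguf", return_metadata=True)

print(metadata.get("general.architecture"))  # expected: "llama" (see note below)
for name, array in list(weights.items())[:5]:
    print(name, array.shape, array.dtype)
```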

Official model: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf (not supported by MLX)

Note that the official Phi-3-mini-4k-instruct GGUF model uses the Llama-2 architecture, as stated in the paper (https://huggingface.co/papers/2404.14219).
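For text generation, any GGUF-compatible runtime also works. Below is a minimal sketch using llama-cpp-python; the local file name and sampling parameters are assumptions, and the prompt follows the Phi-3 chat template.

```python
# Minimal generation sketch using llama-cpp-python, which reads GGUF directly.
# The model_path is hypothetical; point it at the downloaded .gguf file.
from llama_cpp import Llama

llm = Llama(
    model_path="phi-3-mini-4k-instruct-q4.gguf",
    n_ctx=4096,  # Phi-3-mini-4k supports a 4k-token context window
)

# Phi-3 chat template: <|user|> ... <|end|> <|assistant|>
prompt = "<|user|>\nExplain GGUF quantization in one sentence.<|end|>\n<|assistant|>\n"
out = llm(prompt, max_tokens=64, stop=["<|end|>"])
print(out["choices"][0]["text"])
```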

Model details:

- Format: GGUF
- Model size: 3.82B params
- Architecture: llama
- Quantization: 4-bit
