This repo contains a version of Phi-3 quantized to AWQ using AutoAWQ. Serving it via the TGI Docker image currently fails because TGI falls back to AutoModel, which is not compatible with AWQ; hosting on vLLM is recommended instead.
To run the model you need to set the trust-remote-code flag (or its equivalent in your serving stack). Although the remote code comes from Microsoft (see the LICENSE information in the file), you should validate it yourself before deployment.
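As a minimal sketch, loading the model with vLLM's offline Python API might look like the following; the repo id is a placeholder (assumption) and should be replaced with this repository's actual id:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/Phi-3-AWQ",  # placeholder repo id (assumption) -- use this repository's id
    quantization="awq",          # weights are AWQ-quantized
    trust_remote_code=True,      # required: the model ships custom remote code
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain AWQ quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```

For an OpenAI-compatible server, the equivalent flags on the vLLM CLI are `--quantization awq` and `--trust-remote-code`.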