Phi-3.5-mini-instruct-onnx model for DirectML.

#4 opened by Charlie-In-TW

Hi,

Will you release the Phi-3.5-mini-instruct-onnx model for DirectML (like Phi-3-mini-4k-instruct-onnx)? Or do you have any suggestions on how to convert Phi-3.5-mini-instruct to a DirectML ONNX version?

Hi, I found your previous discussion about this:

--
With the newly uploaded INT4 AWQ models, there is now one optimized ONNX model for CPU and one optimized ONNX model for GPU (e.g. CUDA, DirectML). Here is a tutorial you can follow to create your own INT4 AWQ ONNX models.
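For reference, a minimal sketch of driving the model builder to produce an INT4 model targeting DirectML. It assumes the `onnxruntime-genai` DirectML package is installed; the output folder name is hypothetical, and the flags follow the model builder's documented CLI:

```python
# Build an INT4 ONNX model of Phi-3.5-mini-instruct for the DirectML
# execution provider using ONNX Runtime GenAI's model builder.
import subprocess
import sys

subprocess.run(
    [
        sys.executable, "-m", "onnxruntime_genai.models.builder",
        "-m", "microsoft/Phi-3.5-mini-instruct",   # Hugging Face model id
        "-o", "./phi-3.5-mini-instruct-int4-dml",  # output folder (hypothetical)
        "-p", "int4",                              # INT4 weight precision
        "-e", "dml",                               # target the DirectML provider
    ],
    check=True,
)
```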

For INT8 precision, you can create the FP32 ONNX model using ONNX Runtime GenAI's model builder and then use ONNX Runtime's INT8 quantization tools.
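And a minimal sketch of that INT8 route: first produce an FP32 model with the builder (`-p fp32` and `-e cpu` in the command above), then quantize it with ONNX Runtime's dynamic quantization API. The paths below are hypothetical:

```python
# Dynamically quantize a builder-produced FP32 ONNX model to INT8 weights.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="./phi-3.5-mini-instruct-fp32/model.onnx",   # FP32 model from the builder
    model_output="./phi-3.5-mini-instruct-int8/model.onnx",  # INT8 output
    weight_type=QuantType.QInt8,                             # quantize weights to INT8
)
```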
