inference not working in any environment
#5
by
LoreVitCon
- opened
Tried both instruct and instruct-cuda; in both cases I get errors (e.g. missing Triton, which I can't install) or a CUDA version that requires a GPU.
Hi! Thank you for your interest in phi-3!
For the small model, because we use a custom Triton kernel for block-sparse attention, there is a dependency on having a GPU as well as on Triton.
There is active work going on for enabling llama.cpp support for this (see this issue).
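Until CPU support lands, one way to check up front whether the current environment meets the small model's requirements is a quick probe for Triton and a CUDA device before attempting to load the model. A minimal sketch (the helper name `can_run_phi3_small` is hypothetical, not part of the repo):

```python
import importlib.util


def can_run_phi3_small() -> bool:
    """Return True if this environment can run Phi-3-small:
    the Triton package must be importable (needed for the custom
    block-sparse attention kernel) and torch must see a CUDA GPU."""
    # Triton only ships wheels for Linux + CUDA, so this fails on
    # unsupported platforms without raising.
    if importlib.util.find_spec("triton") is None:
        return False
    # torch itself must be installed before we can query for a GPU.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    if not can_run_phi3_small():
        print("Missing Triton and/or a CUDA GPU; Phi-3-small will not load.")
```

Running this before `from_pretrained` gives a clear diagnostic instead of a mid-load kernel or import error.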
bapatra
changed discussion status to
closed