Is there any inference server which can support Phi-3-vision-128K-instruct?
#49 · opened by farzanehnakhaee70
Is there any inference server like Ollama or TGI which can support this model?
Maybe sglang can serve it. It supports LLaVA-NeXT, so I think a small modification could get it to serve Phi-3-vision.
Oh, vLLM now supports Phi-3-vision too. You can see the pull request here: https://github.com/vllm-project/vllm/pull/4986
For now you should install vLLM from source to get that change.
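Here is a minimal offline-inference sketch using vLLM's Python API. The multimodal input format has changed between vLLM versions, so treat the prompt template and the `multi_modal_data` key below as assumptions and check the docs for the version you install:

```python
# Sketch: offline inference of Phi-3-vision with vLLM (API details may vary by version).
from vllm import LLM, SamplingParams
from PIL import Image

llm = LLM(
    model="microsoft/Phi-3-vision-128k-instruct",
    trust_remote_code=True,   # the model ships custom code
    max_model_len=4096,       # keep the context small to limit KV-cache memory
)

image = Image.open("example.jpg")
# Phi-3-vision uses numbered image placeholders in its chat template.
prompt = "<|user|>\n<|image_1|>\nWhat is shown in this image?<|end|>\n<|assistant|>\n"

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(temperature=0.0, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```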
@farzanehnakhaee70 we have support in mistral.rs with multi-batching, in-situ quantization, and Python, OpenAI-compatible, and other APIs: https://github.com/EricLBuehler/mistral.rs/blob/master/docs%2FPHI3V.md
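Once the mistral.rs server is running (see the linked PHI3V doc for the exact launch command), you can query it through its OpenAI-compatible endpoint. A rough sketch with the `openai` Python client; the port, model name, and image URL below are placeholders, not values from the doc:

```python
# Sketch: calling a locally running mistral.rs server via its OpenAI-compatible API.
from openai import OpenAI

# Port and API key are assumptions; match them to how you started the server.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="phi3v",  # placeholder model id; use the one the server reports
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/cat.png"}},
            {"type": "text", "text": "What is in this image?"},
        ],
    }],
    max_tokens=128,
)
print(response.choices[0].message.content)
```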