Love the UI, would love it even more if we could use it with locally deployed models

#108
by nojster - opened

Maybe this is possible and I just didn't figure it out yet, but reading through the docs, and especially the .env and the README, it appears that MODEL_ENDPOINTS must point to an endpoint hosted on HF.

Since I love the clean UI from a user perspective, I'd appreciate some pointers on how to run chat-ui completely locally with local models (:

Hugging Chat org

Hi @nojster

If you want to use local models, you can run https://github.com/huggingface/text-generation-inference locally and point the endpoint to http://127.0.0.1:8080/generate_stream, for example.
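For example, with a small model like bigscience/bloom-560m on port 8080 (the model and port here are just placeholders, adjust to your setup), you could start the server with:

text-generation-launcher --model-id bigscience/bloom-560m --port 8080

and then point chat-ui at it in your .env.local:

MODEL_ENDPOINTS=[{ "endpoint": "http://127.0.0.1:8080/generate_stream", "weight": 1 }]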

Awesome! Your quick reply is much appreciated @coyotte508 !

Will give this a go (:

Hi @coyotte508 and @nojster ,
I tried to follow your conversation, but I am getting this error:

[screenshot of the error]

I set: MODEL_ENDPOINTS=[{ "endpoint": "http://127.0.0.1:8086/generate_stream", #"authorization": "Bearer hf_<token>", "weight": 1 }]

and I am running text-generation-inference locally with: text-generation-launcher --model-id bigscience/bloom-560m --num-shard 2 --port 8086
I also checked the backend response from text-generation-inference with: curl 127.0.0.1:8086/generate_stream -X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":5}}' -H 'Content-Type: application/json'

The output of text-generation-launcher is:

[screenshot of the text-generation-launcher output]

and the output of the curl command is:

[screenshot of the curl output]

I highly appreciate your support.

Hugging Chat org

you should set MODEL_ENDPOINTS=[{ "endpoint": "http://127.0.0.1:8086/generate_stream", "weight": 1 }]

@coyotte508 Thanks a lot for your feedback. I think I am missing something in your reply, since I see no difference in the way I defined MODEL_ENDPOINTS.
I just commented out "authorization": "Bearer hf_", which in your response is deleted. I guess the result would be the same?
Anyway, I entered your code but I am getting the same results :(

Best

Hugging Chat org

Because you can't add # in the middle of a definition in your .env.local
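If you actually need the authorization header (e.g. for a protected endpoint), keep it as a regular JSON field rather than a comment, something like:

MODEL_ENDPOINTS=[{ "endpoint": "http://127.0.0.1:8086/generate_stream", "authorization": "Bearer hf_<token>", "weight": 1 }]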

By the way, the code was updated & the configuration simplified:

  • git pull
  • npm install
  • Then in your .env.local, add:
MODELS=[{"name":"mymodel","endpoints":[{"url":"http://127.0.0.1:8086/generate_stream"}]}]

It should work hopefully
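As a rough sketch (the "parameters" field is from memory, so please double-check against the current chat-ui README), the MODELS entry can also carry generation parameters similar to the ones TGI accepts in the curl call above, e.g.:

MODELS=[{"name":"mymodel","endpoints":[{"url":"http://127.0.0.1:8086/generate_stream"}],"parameters":{"max_new_tokens":256}}]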

worked now! Thanks :)

Hugging Chat org

Would love it if you could share screenshots of your local instances of chat-ui, BTW! Or more details about what you're building using it 💪

I am always getting this error, "You dont have access to this conversation" (403), when I am using my local chat-ui with Open Assistant. Not sure why I am getting the 403 error. I configured an access token with write permissions in my .env.local file.

I am getting the same error, "You dont have access to this conversation" (403), when I am using my local chat-ui with Open Assistant. I didn't set up an access token. Running in k8s.

https://huggingface.co/docs/text-generation-inference/basic_tutorials/consuming_tgi mentions the MODELS variable but not MODEL_ENDPOINTS.
