Love the UI, would love it even more if we could use it with locally deployed models
Maybe this is possible and I just haven't figured it out yet, but reading through the docs, especially the .env file and the README, it appears that MODEL_ENDPOINTS must point to an endpoint hosted on HF.
Since I love the clean UI from a user perspective, I'd appreciate some pointers on how to run chat-ui completely local with local models (:
Hi @nojster
If you want to use local models, you can run https://github.com/huggingface/text-generation-inference locally and point the endpoint to http://127.0.0.1:8080/generate_stream, for example.
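A rough sketch of what that can look like, assuming the dockerized text-generation-inference setup from its README (the model, port, and paths are just placeholders, so check the TGI docs for the exact flags for your version):

```sh
# Serve a model locally with text-generation-inference (example model; adjust to your hardware)
docker run --gpus all --shm-size 1g -p 8080:80 -v $PWD/data:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id bigscience/bloom-560m

# Then, in chat-ui's .env.local, point the endpoint at it:
MODEL_ENDPOINTS=[{ "endpoint": "http://127.0.0.1:8080/generate_stream", "weight": 1 }]
```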
Awesome! Your quick reply is much appreciated @coyotte508 !
Will give this a go (:
Hi @coyotte508 and @nojster,
I tried to follow your conversation, but I am getting an error:
I set: MODEL_ENDPOINTS=[{ "endpoint": "http://127.0.0.1:8086/generate_stream", #"authorization": "Bearer hf_<token>", "weight": 1 }]
and I am running text-generation-inference locally with: text-generation-launcher --model-id bigscience/bloom-560m --num-shard 2 --port 8086
I also checked the backend responses from text-generation-inference with: curl 127.0.0.1:8086/generate_stream -X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":5}}' -H 'Content-Type: application/json'
The output of text-generation-launcher is:
and the output of the curl command is:
I highly appreciate your support.
You should set MODEL_ENDPOINTS=[{ "endpoint": "http://127.0.0.1:8086/generate_stream", "weight": 1 }]
@coyotte508
Thanks a lot for your feedback. I just think I am missing something in your reply, since I see no difference in the way I defined MODEL_ENDPOINTS.
I just commented out the "authorization": "Bearer hf_" part, which in your response is deleted entirely. I guess the result would be the same?
Anyway, I entered your code but I am getting the same results :(
Best
Because you can't add # in the middle of a definition in your .env.local.
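To make that concrete, here is a minimal sketch (assuming the value is read as a plain dotenv variable, so everything has to stay on one line of valid JSON; the # is not treated as a comment inside the value):

```sh
# Broken: the "#" ends up inside the JSON value, so the array no longer parses
MODEL_ENDPOINTS=[{ "endpoint": "http://127.0.0.1:8086/generate_stream", #"authorization": "Bearer hf_<token>", "weight": 1 }]

# Working: drop the field entirely (or keep it as real JSON if the endpoint actually needs a token)
MODEL_ENDPOINTS=[{ "endpoint": "http://127.0.0.1:8086/generate_stream", "weight": 1 }]
```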
By the way, the code was updated and the configuration simplified:
- git pull
- npm install
- Then in your .env.local, add:
MODELS=[{"name":"mymodel","endpoints":[{"url":"http://127.0.0.1:8086/generate_stream"}]}]
It should work, hopefully.
Worked now! Thanks :)
Would love it if you could share screenshots of your local instances of chat-ui, BTW! Or more details about what you're building with it 💪
I am always getting the error "You don't have access to this conversation" (403) when I use my local chat-ui with Open Assistant. Not sure why I am getting a 403 error. I configured an access token with write permissions in my .env.local file.
I am getting the same error, "You don't have access to this conversation" (403), when using my local chat-ui with Open Assistant. I didn't set up an access token; I'm running in k8s.
https://huggingface.co/docs/text-generation-inference/basic_tutorials/consuming_tgi mentions the MODELS variable but not MODEL_ENDPOINTS.
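For anyone landing here from those docs: MODEL_ENDPOINTS is the older chat-ui setting, and the thread above switched to the newer MODELS form. A rough mapping, adapted from the examples earlier in this thread (verify against the current chat-ui README):

```sh
# Old style (no longer documented)
MODEL_ENDPOINTS=[{ "endpoint": "http://127.0.0.1:8086/generate_stream", "weight": 1 }]

# New style, using the MODELS variable mentioned in the docs
MODELS=[{"name":"mymodel","endpoints":[{"url":"http://127.0.0.1:8086/generate_stream"}]}]
```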