Which model is responsible for naming the thread?

#402
by qnixsynapse - opened

image.png

Here the main model is Gemma 1.1 7B. It obviously got the answer wrong, but the thread name made me ROFL! 🤣 Thought I'd ask here.

Well, Mistral 7B is responsible for this.

From here

HF Chat has a Mistral 7B model set up with a system prompt for the task of summarizing the first chat prompt/message into a title for the chat history log, so unless one explicitly addresses that in the first message, it is what it is, I guess. And we can always rename it. Still, I think it would have been awesome if we could customize the naming style/prompt ourselves.

So yeah, the first prompt goes to 2 models.

@SvCy Do you know the reason for that?

Hugging Chat org

Mistral 7B is a lightweight model that follows instructions pretty well, so we use it for all our "tasks" (generating the web search query, finding a title for the conversation, etc.)
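
A rough sketch of that split between the main model and the task model, for illustration only. The function name, the `Generate` type, and the title instruction are hypothetical and are not taken from the HuggingChat codebase; each model is stood in for by a plain callable that maps a prompt to text.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a model: takes a prompt string, returns generated text.
Generate = Callable[[str], str]


def handle_first_message(message: str, main_model: Generate, task_model: Generate) -> Dict[str, str]:
    """Send the first user message to two models.

    The main model answers the user; the lightweight task model only
    produces a short title for the conversation history.
    """
    reply = main_model(message)
    # Assumed wording for the title task; the real system prompt is not shown in this thread.
    title = task_model(f"Summarize this message as a short conversation title: {message}")
    return {"reply": reply, "title": title.strip()}
```

Because the title comes from a separate model, a wrong answer from the main model and an odd thread name can have unrelated causes.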

Ok, thanks

Yes. Mistral 7B is an amazing model. It's best to prompt the model like this (assuming it is the instruct model): "[INST] Create a topic name from the following message[s]: {messages} [/INST]".

Something like this (I used Gemma 1.1 in this case and honestly I am finding it amazing!):
image.png
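
The prompt format suggested above can be sketched as a small helper. This is a minimal example, assuming the Mistral instruct `[INST] ... [/INST]` wrapping quoted in this comment; the function name `build_title_prompt` and the choice to join messages with newlines are my own assumptions, and this is not the prompt HuggingChat actually uses.

```python
from typing import List


def build_title_prompt(messages: List[str]) -> str:
    """Wrap chat messages in [INST] tags to ask an instruct model for a topic name."""
    # Drop empty messages and surrounding whitespace, then join one message per line.
    joined = "\n".join(m.strip() for m in messages if m.strip())
    return f"[INST] Create a topic name from the following message[s]: {joined} [/INST]"


if __name__ == "__main__":
    print(build_title_prompt(["Which model names the threads in HF Chat?"]))
```

The returned string would then be sent to whichever instruct model handles the title task.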

deleted
edited Apr 19

Mistral 7B is a lightweight model that follows instructions pretty well, so we use it for all our "tasks" (generating the web search query, finding a title for the conversation, etc.)

I suggest switching to Dolphin-Mistral-7B. Its cross-language ability, generation quality and breadth of knowledge are amazing.

@qnixsynapse and @Mindires
The task model has been changed to Llama 3 70B.
Thread naming has also been improved by @nsarrazin.
Check it out!
