Problems with using this

#2 opened by flowwasnthere

Hello!

This model looks very promising and I would love to use it, but I'm having issues.

I'm using it with Ollama and OpenWebUI. I'm new to running LLMs locally, so bear with me here haha.

In my prompt, I provided a bunch of loose ideas and some guidance for what should happen in the script, and asked it to help me write it. The output was all over the place. It would alternate between lowercase and uppercase letters a lot, sometimes even in the middle of words, KiNd oF lIkE tHiS. Parts like "CUT TO:", "INT. BATHROOM - LATER", and "INT. ABANDONED BASEMENT - NIGHT" would show up in places that make zero sense; in the middle of dialogue it would cut back and forth between scenes and locations when it wasn't supposed to. Then at the end it spammed "<|end_of_text|>" over and over and started talking about "effective ways to promote a podcast without spending a ton of money" for some reason. Super incoherent, basically.

I'm sure this is because I've set it up wrong or need to change some settings to get it working properly. Everything is at the defaults right now; I've attached a screenshot of the settings that can be changed. Would appreciate it a ton if you could help me out here!

Should also mention that I'm using the GGUF file that's 17.6 GB. Wasn't sure which one to pick so I just used the biggest one in hopes of higher quality.

[Attached screenshot: settings 1.png]

Owner

Hi:

Please note you need the Llama3 template to use this (the "repeat" issue seems to point to a problem here).

Also:
Set "temp" to .4 , TOP_K to 40, and REPEAT PEN to 1.1 to start.
And turn FREQ PEN to OFF.
I don't know the defaults for Ollama, but I do know these settings are critical for a starting point.

RE: Quant to use -> Q4 or IQ4 (or higher) are great.
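
For reference, here is a minimal Ollama Modelfile sketch that applies those starting settings and the Llama3 prompt template when importing the GGUF. The filename is a placeholder, and the template shown is the standard Llama 3 instruct format; copy the exact template text from the model page (or Ollama's own llama3 models) if it differs.

```
# Minimal Ollama Modelfile sketch; the FROM path is a placeholder for the GGUF you downloaded.
FROM ./your-model.Q4_K_M.gguf

# Standard Llama 3 instruct prompt format (replace with the model page's template if it differs).
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""

# Stop on Llama 3's end-of-turn/end-of-text tokens so generation doesn't run past the reply.
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|end_of_text|>"

# Suggested starting sampler settings from above.
PARAMETER temperature 0.4
PARAMETER top_k 40
PARAMETER repeat_penalty 1.1
```

Build it with `ollama create <some-name> -f Modelfile` and then pick that model in OpenWebUI; the template and repeat penalty then travel with the model even if the UI doesn't expose those settings directly.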

Thanks for the response!

Yes, I saw the part about the standard Llama3 template on the model page, but I didn't find any clear place to put it in Ollama/OpenWebUI. It also seems like some settings are missing or aren't exposed, like Repeat Penalty.

The way I've got things set up may not be ideal. I'm certainly open to trying something other than Ollama for this. Do you have any recommendation for me? How are you running it yourself?

Owner

I run it in LM Studio [ lmstudio.ai ]; you can also use Text Generation WebUI (GitHub) - https://github.com/oobabooga/text-generation-webui .
The latter is a bit more complex, but it gives you a ton of setting options.
