Bad results with ggml version of your model

by appliedstuff - opened Jul 28, 2023

Jul 28, 2023

Hello,

Do you have any example prompts and responses for your model? With the GGML version from TheBloke, the model performs very badly. See here: https://huggingface.co/TheBloke/llama-2-13B-German-Assistant-v2-GGML/discussions

So I am interested in what you think is causing this.
Is it the GGML conversion that makes it so bad, or is the model itself responsible for the poor performance?

flozi00

Owner Jul 28, 2023

TheBloke and I are aware of the poor performance.
Since I have never worked with GGML, I have no experience with how to fix it.

appliedstuff

Jul 30, 2023

edited Jul 30, 2023

OK, does that mean the results with the full model are better? Could you provide some examples in the model card? That would be great! It would make it much easier to decide whether to use the model without first deploying it on a GPU server. Thanks in advance!
