Gemma-2 is a huge step up over previous Google OS models - short feedback

#22
by Dampfinchen - opened

I just wanted to say I really like the direction Google is taking Gemma. It's actually very competitive and in some aspects better than other open source models from Meta and Mistral, which was far from being the case before.

It doesn't refuse reasonable requests and it demonstrates high intelligence for its respective sizes. It's also good at instruction following and has a nice writing style.

What can definitely be improved is the context size - 8K with SWA is really not much these days. It also has trouble with some formatting styles for roleplaying (markdown/asterisks, quotes etc.), but that could be down to the llama.cpp implementation. It's also very slow, much slower than expected for its size and compared to Llama 8B (both without FA); I assume this has something to do with logit soft capping and the high VRAM usage from half GQA instead of full GQA. Finally, for future models I would also like to see support for system prompts.
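For context on the soft capping point: Gemma-2 squashes attention scores and final logits with a tanh-based cap, which (at the time) the fused FlashAttention path in llama.cpp couldn't handle, so attention fell back to a slower route. Here's a minimal sketch of the idea, assuming the publicly documented cap values (~50.0 for attention, ~30.0 for the final logits) and with a function name I made up for illustration:

```python
import math

# Sketch of tanh-based logit soft capping as described for Gemma-2.
# The cap values follow the published Gemma-2 config; the function itself
# is illustrative, not Google's or llama.cpp's actual code.
def soft_cap(logit: float, cap: float) -> float:
    # Smoothly squashes a logit into (-cap, cap) instead of hard clipping.
    return cap * math.tanh(logit / cap)

attn_score = soft_cap(73.2, 50.0)    # ~44.9 - large scores get pulled under the cap
final_logit = soft_cap(12.0, 30.0)   # ~11.4 - small values pass through almost unchanged
```

Because the cap has to be applied to every attention score before softmax, it breaks the assumptions of the fused kernels, which is plausibly a big part of the speed gap I'm seeing.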

Thank you for your nice work, Google team. It's really appreciated! Please keep up the good work and thank you for your contributions to the OS community.

Google org

Hi @Dampfinchen, thank you for your valuable feedback! We appreciate your support and insights.
