New GGUF chat template ignores the 'system' role when preparing chat completions
Pretty much what it says in the title. As of the most recent upload, the published quants list the chat template as:
{{ bos_token }}{% for message in messages %}{% if (message['role'] == 'user') %}{{'<|user|>' + '
' + message['content'] + '<|end|>' + '
' + '<|assistant|>' + '
'}}{% elif (message['role'] == 'assistant') %}{{message['content'] + '<|end|>' + '
'}}{% endif %}{% endfor %}
...which has the net result of ignoring any system prompt passed in.
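For anyone who wants to confirm this locally, here is a minimal sketch that renders the template quoted above with the `jinja2` Python package. The `bos_token` value `<s>` and the example messages are assumptions chosen for illustration, not values taken from the GGUF metadata.

```python
from jinja2 import Environment

# Template as quoted above. Each "\n" below becomes the literal line break
# that sits inside the template's string literals.
TEMPLATE = (
    "{{ bos_token }}{% for message in messages %}"
    "{% if (message['role'] == 'user') %}"
    "{{'<|user|>' + '\n' + message['content'] + '<|end|>' + '\n' + '<|assistant|>' + '\n'}}"
    "{% elif (message['role'] == 'assistant') %}"
    "{{message['content'] + '<|end|>' + '\n'}}"
    "{% endif %}{% endfor %}"
)

# Illustrative conversation: one system message, one user message.
messages = [
    {"role": "system", "content": "Always answer like a pirate."},
    {"role": "user", "content": "Hello"},
]

# bos_token="<s>" is an assumed placeholder value.
prompt = Environment().from_string(TEMPLATE).render(bos_token="<s>", messages=messages)
print(repr(prompt))
```

The system message matches neither the `user` nor the `assistant` branch, so nothing is emitted for it and its text never reaches the prompt.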
Root cause discovered upstream, quoting from https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/discussions/51:
The model has not been optimized for the system instruction and produces better generations without it.
That’s why we opted to remove altogether any reference to system. Try appending it to your first user prompt, should work better than a separate system instruction.
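For reference, the client-side workaround suggested in that quote could look roughly like the sketch below. The helper name `fold_system_into_first_user` is hypothetical. Placing the system text before the user text and separating them with a blank line are both assumptions, since the quote does not specify either.

```python
def fold_system_into_first_user(messages):
    """Return a new message list with system content merged into the first user turn.

    Hypothetical helper: the ordering and the blank-line separator are assumptions.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    # Copy the remaining messages so the caller's list is left untouched.
    rest = [dict(m) for m in messages if m["role"] != "system"]
    if not system_parts:
        return rest
    prefix = "\n\n".join(system_parts)
    for m in rest:
        if m["role"] == "user":
            m["content"] = prefix + "\n\n" + m["content"]
            break
    # If there is no user message at all, the system text is still dropped.
    return rest


folded = fold_system_into_first_user([
    {"role": "system", "content": "Be terse."},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello"},
    {"role": "user", "content": "Bye"},
])
print(folded)
```

Every application that talks to the model would need a step like this before sending its messages.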
That still breaks the API contract with any tool that uses the system role. I recommend adding your proposed fix to the chat template itself. Otherwise, your proposed solution requires every downstream consumer of your model to make a breaking change to their application. The fix definitely belongs in the chat template.
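To make the request concrete, here is one possible shape for such a template, rendered with the `jinja2` Python package. This is a sketch only and not a template published by the model authors. It assumes a Jinja version that supports `namespace` (2.10 or later), an assumed `bos_token` of `<s>`, and a blank line between the system text and the user text. Whether a given GGUF runtime evaluates this template as written has not been verified here.

```python
from jinja2 import Environment

# Hypothetical patched template: it remembers the system content and
# prepends it to the next user turn, then clears it.
PATCHED_TEMPLATE = (
    "{{ bos_token }}"
    "{% set ns = namespace(system='') %}"
    "{% for message in messages %}"
    "{% if message['role'] == 'system' %}"
    "{% set ns.system = message['content'] + '\n\n' %}"
    "{% elif message['role'] == 'user' %}"
    "{{ '<|user|>' + '\n' + ns.system + message['content'] + '<|end|>' + '\n' + '<|assistant|>' + '\n' }}"
    "{% set ns.system = '' %}"
    "{% elif message['role'] == 'assistant' %}"
    "{{ message['content'] + '<|end|>' + '\n' }}"
    "{% endif %}{% endfor %}"
)

# Illustrative conversation with a system message.
chat = [
    {"role": "system", "content": "Answer in one word."},
    {"role": "user", "content": "Capital of France?"},
]

# bos_token="<s>" is an assumed placeholder value.
patched_prompt = Environment().from_string(PATCHED_TEMPLATE).render(
    bos_token="<s>", messages=chat
)
print(repr(patched_prompt))
```

With something along these lines, clients keep sending a normal `system` message and the template does the folding, so no downstream application has to change.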