How to run inference with llama.cpp
#1 by anse10rville - opened
If I just run `./main -m path/to/this` in llama.cpp, it raises the error `GGML_ASSERT: examples/main/main.cpp:248: llama_add_eos_token(model) != 1`. My local llama.cpp works fine with other GGUFs, e.g. Qwen/Qwen1.5-7B-Chat-GGUF. Could you add some notes, please?
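For reference, here is a minimal sketch of the invocation I'm attempting; the model path and the prompt are placeholders, not the actual values:

```sh
# Minimal sketch: path/to/model.gguf is a placeholder for the downloaded GGUF.
# -m: path to the GGUF model file
# -p: prompt text
# -n: number of tokens to generate
./main -m path/to/model.gguf -p "Hello, how are you?" -n 128
```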
Thanks!