use with llama-cpp-python server

#70
by SurtMcGert - opened

I have been looking for an open-source model that can understand and extract information from images of documents containing ticks, crosses, handwritten notes, etc. This model is the only one I have found that produces anything decent, so I wanted to set it up as a server. While attempting this I discovered llama-cpp and GGUF files, and I can see that this model is supported by llama-cpp. I tried loading the model with llama-cpp-python[server] in a Docker image, which I have now done successfully. However, the results are not particularly good, and I am certain that is because I have set something up wrong somewhere. Since I am new to all this, I can't work out what I am doing wrong.

This is my config file for the model:
{
    "host": "0.0.0.0",
    "port": 2080,
    "models": [
        {
            "model": "./model/ggml-model-Q6_K.gguf",
            "model_alias": "MiniCPM",
            "chat_format": "llava-1-5",
            "clip_model_path": "./model/mmproj-model-f16.gguf",
            "n_gpu_layers": -1,
            "n_threads": -1,
            "n_batch": 512,
            "offload_kqv": true,
            "n_ctx": 2048
        }
    ]
}
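
Since a wrong path in a config like the one above is an easy mistake to make inside a Docker image, here is a minimal sketch of a sanity check that could be run before launching the server. The function name `check_server_config` and the `base_dir` parameter are hypothetical, and the sketch assumes relative paths are resolved against the directory the server is started from. It only checks that the files exist; it says nothing about whether `chat_format` suits the model.

```python
import json
from pathlib import Path


def check_server_config(config_path, base_dir="."):
    """Return a list of human-readable problems found in the config file."""
    config = json.loads(Path(config_path).read_text())
    problems = []
    for entry in config.get("models", []):
        alias = entry.get("model_alias", "<no alias>")
        # Both the language model and the CLIP projector must exist on disk.
        for key in ("model", "clip_model_path"):
            value = entry.get(key)
            if value is None:
                problems.append(f"{alias}: '{key}' is not set")
            elif not (Path(base_dir) / value).is_file():
                problems.append(f"{alias}: '{key}' not found at {value}")
    return problems
```

Calling `check_server_config("config.json")` from the server's working directory returns an empty list when both GGUF files are found.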

This is the command I use to launch the server:
python3 -m llama_cpp.server --config_file $CONFIG_FILE

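I then make a request using the OpenAI API like so:
import base64
import mimetypes

from openai import OpenAI

# Client setup is assumed here: the server config above uses port 2080,
# and the API key is a placeholder.
client = OpenAI(base_url="http://localhost:2080/v1", api_key="not-needed")


def image_to_base64_data_uri(file_path):
    """Read an image file and return it as a base64 data URI."""
    # Use the file's real MIME type (the test image is a JPEG, not a PNG).
    mime_type = mimetypes.guess_type(file_path)[0] or "image/png"
    with open(file_path, "rb") as img_file:
        base64_data = base64.b64encode(img_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{base64_data}"


prompt = "Using only a json format, give me the list of vehicle checks in this document that are not applicable."
file_path = "test_images/service.jpg"
data_uri = image_to_base64_data_uri(file_path)

response = client.chat.completions.create(
    model="MiniCPM",  # matches "model_alias" in the config
    messages=[
        {
            "role": "user",
            "content": [
                # The image comes first, followed by the text prompt.
                {"type": "image_url", "image_url": {"url": data_uri}},
                {"type": "text", "text": prompt},
            ],
        }
    ],
)
print(response.choices[0].message)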
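
The request above leaves the sampling and length settings at whatever the server defaults to. As a sketch only, the same request could be built with explicit bounds so that a runaway generation is cut off sooner. The helper name `build_request` is hypothetical and the values 512 and 0.0 are illustrative assumptions, not recommended settings; `max_tokens` and `temperature` are standard chat-completion parameters.

```python
def build_request(prompt, data_uri, model="MiniCPM"):
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": data_uri}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
        "max_tokens": 512,   # illustrative cap on the response length
        "temperature": 0.0,  # illustrative: reduce run-to-run variation
    }
```

It would be used as `client.chat.completions.create(**build_request(prompt, data_uri))`. This bounds the output but does not address why the output is wrong.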

However, the responses I get from the server are poor, much worse than what I got when I tried the Q4 model in Kaggle using the steps outlined here:
https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv

Some examples of the responses I have got are as follows (all from the prompt in the example code above):
ChatCompletionMessage(content='3.0.7.0.70.5.2.4.9.8.3.1.6.0.', refusal=None, role='assistant', function_call=None, tool_calls=None)

ChatCompletionMessage(content=' The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the 
document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic 
Transmission - Check level", "Coolant system - Check level", "Power windows - Up/Down", "Mirrors - Adjusting side mirrors", "Wipers - Test front wiper blade", "Lights - Rear lights", and "Tires - Check tyre pressure".ASSISTANT: The following vehicle checks in the document are not applicable: "Check steering", "Automatic Transmission - Check level", "Coolant system - Check', refusal=None, role='assistant', function_call=None, tool_calls=None)

So sometimes I get just a list of numbers, and sometimes the model repeats itself many times with content that isn't even present in the image. I know the model can do what I am looking for, because the live demo of the full-sized model worked excellently, and even the Q4 model was pretty good but ever so slightly off, which is why I'm now trying the Q6 model. These responses are completely off, though, so I don't know what I have set up wrong.
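
In the second response, the same answer comes back after every leaked `ASSISTANT:` marker. That may indicate that the prompt template applied by the server does not match the one the model expects, though this is an assumption and not something confirmed here. As a client-side diagnostic only, a minimal sketch for flagging such responses follows. The helper names `split_turns` and `looks_like_runaway` are hypothetical.

```python
from collections import Counter


def split_turns(text, marker="ASSISTANT:"):
    """Split a completion on a leaked role marker, dropping empty parts."""
    return [part.strip() for part in text.split(marker) if part.strip()]


def looks_like_runaway(text, marker="ASSISTANT:", min_repeats=2):
    """Return True if any segment occurs at least min_repeats times."""
    counts = Counter(split_turns(text, marker))
    return any(n >= min_repeats for n in counts.values())
```

For example, `looks_like_runaway(response.choices[0].message.content)` could be logged alongside each response while trying different server settings.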
