ValidationError when running this model using llama-cpp-python
Hi,
I am trying to follow this doc, https://python.langchain.com/docs/integrations/llms/llamacpp, to install and run the TheBloke/dolphin-2.7-mixtral-8x7b-GGUF model.
I was able to install llama-cpp-python with the command `CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python`, after first installing the CUDA toolkit with `sudo apt-get -y install nvidia-cuda-toolkit`.
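A quick sanity check like this (just importing the package and printing its version, nothing model-specific) can confirm the wheel built and installed correctly:

```python
# Sanity check: the freshly built wheel should import without errors
import llama_cpp

print(llama_cpp.__version__)
```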
My hardware and OS info is:
OS: Kubuntu 22.04.3 LTS x86_64
CPU: 12th Gen Intel i9-12900KF (24) @ 5.100GHz
GPU: NVIDIA GeForce RTX 3080 Lite Hash Rate
Memory: 4938MiB / 64151MiB
When I run this Python code:

```python
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_community.llms import LlamaCpp

template = """Question: {question}
Answer: Let's work this out in a step by step way to be sure we have the right answer."""
prompt = PromptTemplate(template=template, input_variables=["question"])

# Callbacks support token-wise streaming
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

n_gpu_layers = -1  # Number of layers to put on the GPU; the rest stay on the CPU. -1 moves all layers to the GPU.
n_batch = 512  # Should be between 1 and n_ctx; consider the amount of VRAM in your GPU.

# Make sure the model path is correct for your system!
llm = LlamaCpp(
    model_path="/home/cristian/development/ai/models/dolphin-2.7-mixtral-8x7b.Q4_K_M.gguf",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    verbose=True,  # Verbose is required to pass to the callback manager
)
```
It throws the error:
```
ValidationError                           Traceback (most recent call last)
Cell In[28], line 5
      2 n_batch = 512  # Should be between 1 and n_ctx; consider the amount of VRAM in your GPU.
      4 # Make sure the model path is correct for your system!
----> 5 llm = LlamaCpp(
      6     model_path="/home/cristian/development/ai/models/dolphin-2.7-mixtral-8x7b.Q4_K_M.gguf",
      7     n_gpu_layers=n_gpu_layers,
      8     n_batch=n_batch,
      9     callback_manager=callback_manager,
     10     verbose=True,  # Verbose is required to pass to the callback manager
     11 )

File ~/development/pyenvs/ia_gft_pyenv/lib/python3.10/site-packages/langchain_core/load/serializable.py:107, in Serializable.__init__(self, **kwargs)
    106 def __init__(self, **kwargs: Any) -> None:
--> 107     super().__init__(**kwargs)
    108     self._lc_kwargs = kwargs

File ~/development/pyenvs/ia_gft_pyenv/lib/python3.10/site-packages/pydantic/v1/main.py:341, in BaseModel.__init__(__pydantic_self__, **data)
    339 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
    340 if validation_error:
--> 341     raise validation_error
    342 try:
    343     object_setattr(__pydantic_self__, '__dict__', values)

ValidationError: 1 validation error for LlamaCpp
__root__
  Could not load Llama model from path: /home/cristian/development/ai/models/dolphin-2.7-mixtral-8x7b.Q4_K_M.gguf. Received error (type=value_error)
```
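If it helps with debugging: the LangChain wrapper seems to swallow the underlying llama.cpp error, so loading the model directly with llama-cpp-python (a minimal sketch below, using the same path and settings as above) should surface the actual failure in its load log:

```python
# Minimal sketch: load the GGUF directly with llama-cpp-python, bypassing
# LangChain, so the underlying load error is printed instead of being
# wrapped in a pydantic ValidationError.
from llama_cpp import Llama

llm = Llama(
    model_path="/home/cristian/development/ai/models/dolphin-2.7-mixtral-8x7b.Q4_K_M.gguf",
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_batch=512,
    verbose=True,  # print llama.cpp's load log
)
```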
I tried downgrading llama-cpp-python from the latest version `0.2.38` to version `0.2.26`, but I still get the same error.
Could anyone help me figure out what I should do?