Linux, CUDA, AVX512:

```sh
$ git clone "https://github.com/abetlen/llama-cpp-python" --depth=1
$ cd llama-cpp-python
$ git submodule update --init --recursive
$ CMAKE_ARGS="-DLLAMA_CUBLAS=on" python -m pip install .
```
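After a build like the one above, a short smoke test can confirm the wheel imports and generates text. This is a minimal sketch, not part of the original instructions: the `model_path` is a placeholder for any local GGUF model, and `n_gpu_layers=-1` assumes a CUDA-enabled build (it offloads all layers to the GPU; it is ignored in a CPU-only build).

```python
# Minimal smoke test for a freshly built llama-cpp-python wheel.
# Assumes a local GGUF model file; the path below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./model.gguf",  # placeholder: any local GGUF model
    n_gpu_layers=-1,            # offload all layers (CUDA builds only)
)

out = llm("Q: What is the capital of France? A:", max_tokens=16)
print(out["choices"][0]["text"])
```

If the import alone succeeds but generation fails, the build is likely fine and the model path is the problem; re-running `pip install` with `--verbose` shows whether CMake actually picked up the cuBLAS flag.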