---
license: llama3.1
base_model: Nexesenex/Dolphin3.0-Llama3.1-1B-abliterated
tags:
  - llama-cpp
  - gguf-my-repo
---

# Nexesenex/Dolphin3.0-Llama3.1-1B-abliterated-GGUF

**IMPORTANT:** These models are quantized with IK_Llama.cpp, not llama.cpp.

This model was converted to GGUF format from [Nexesenex/Dolphin3.0-Llama3.1-1B-abliterated](https://huggingface.co/Nexesenex/Dolphin3.0-Llama3.1-1B-abliterated) using IK Llama, a fork of llama.cpp, via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space. Refer to the [original model card](https://huggingface.co/Nexesenex/Dolphin3.0-Llama3.1-1B-abliterated) for more details on the model.

## Use with llama.cpp (I have not tested this path with IK_Llama)

Install llama.cpp through brew (works on Mac and Linux):
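A minimal sketch of the install step, assuming the standard Homebrew formula name:

```bash
# Install the llama.cpp binaries (llama-cli, llama-server) via Homebrew
brew install llama.cpp
```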

Note: You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo (necessary to use Croco).
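For instance, the brew-installed `llama-server` can serve this checkpoint straight from Hugging Face. The `--hf-repo`/`--hf-file` flags are standard llama.cpp options, but the GGUF filename below is a hypothetical placeholder; substitute a file that actually exists in this repo:

```bash
# Serve the model over an OpenAI-compatible HTTP endpoint with a 2048-token context;
# replace the --hf-file value with a real quant from this repo
llama-server --hf-repo Nexesenex/Dolphin3.0-Llama3.1-1B-abliterated-GGUF \
  --hf-file dolphin3.0-llama3.1-1b-abliterated-q8_0.gguf \
  -c 2048
```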

Step 1: Clone llama.cpp from GitHub (necessary to use Croco).
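A sketch of the clone step, using the upstream llama.cpp repository URL:

```bash
# Fetch the llama.cpp sources
git clone https://github.com/ggerganov/llama.cpp
```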

Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag, along with any hardware-specific flags (for example, LLAMA_CUDA=1 for NVIDIA GPUs on Linux).
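A sketch of the build step under the Makefile flow this card describes; recent llama.cpp versions have moved to CMake, so treat these exact flags as assumptions tied to older checkouts:

```bash
# Build with libcurl support so the binaries can download models from Hugging Face;
# append hardware flags (e.g. LLAMA_CUDA=1) as your setup requires
cd llama.cpp && LLAMA_CURL=1 make
```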

Step 3: Run inference through the main binary.
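For example, a hedged invocation of the freshly built CLI; the GGUF filename is again a hypothetical placeholder to replace with a real quant from this repo:

```bash
# One-shot prompt completion with the locally built binary
./llama-cli --hf-repo Nexesenex/Dolphin3.0-Llama3.1-1B-abliterated-GGUF \
  --hf-file dolphin3.0-llama3.1-1b-abliterated-q8_0.gguf \
  -p "The meaning to life and the universe is"
```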