
Original model: https://huggingface.co/brucethemoose/Yi-34B-200K-RPMerge

Steps:

  1. Convert to GGUF using llama.cpp (clone it from source, install the requirements, then run this):

    python convert.py /mnt/d/LLM_Models/Yi-34B-200K-RPMerge/ --vocab-type hfft --outtype f32 --outfile Yi-34B-200K-RPMerge.gguf

  2. Create the imatrix (offload as many layers as you can to the GPU with -ngl):

    ./imatrix -m /mnt/d/LLM_Models/Yi-34B-200K-RPMerge.gguf -f /mnt/d/LLM_Models/8k_random_data.txt -o /mnt/d/LLM_Models/Yi-34B-200K-RPMerge.imatrix.dat -ngl 20

  3. Quantize using the imatrix:

    ./quantize --imatrix /mnt/d/LLM_Models/Yi-34B-200K-RPMerge.imatrix.dat /mnt/d/LLM_Models/Yi-34B-200K-RPMerge.gguf /mnt/d/LLM_Models/Yi-34B-200K-RPMerge.IQ2_XXS.gguf IQ2_XXS
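
The three steps can also be chained in one small script. This is only a sketch under a few assumptions: it is run from the llama.cpp directory, `MODEL_DIR`, `NGL`, `DRY_RUN` and the `run` helper are illustrative names (not llama.cpp options), and the f32 GGUF is written into `MODEL_DIR` so that steps 2 and 3 can find it. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: convert -> imatrix -> quantize with shared paths.
# MODEL_DIR, NGL, DRY_RUN and run() are hypothetical helpers for this example.
MODEL_DIR="${MODEL_DIR:-/mnt/d/LLM_Models}"
NAME="Yi-34B-200K-RPMerge"
NGL="${NGL:-20}"          # layers to offload to the GPU in step 2
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Assumption: the f32 GGUF lives in MODEL_DIR next to the other outputs.
F32="$MODEL_DIR/$NAME.gguf"
IMATRIX="$MODEL_DIR/$NAME.imatrix.dat"
QUANT="$MODEL_DIR/$NAME.IQ2_XXS.gguf"

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Each step runs only if the previous one succeeded.
run python convert.py "$MODEL_DIR/$NAME/" --vocab-type hfft --outtype f32 --outfile "$F32" &&
run ./imatrix -m "$F32" -f "$MODEL_DIR/8k_random_data.txt" -o "$IMATRIX" -ngl "$NGL" &&
run ./quantize --imatrix "$IMATRIX" "$F32" "$QUANT" IQ2_XXS
```

Setting `DRY_RUN=0` executes the commands instead of printing them; `NGL` can be raised or lowered to match the available VRAM.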

I have also uploaded 8k_random_data.txt (from this GitHub discussion) and the importance matrix I made (Yi-34B-200K-RPMerge.imatrix.dat).

Format: GGUF. Model size: 34.4B params. Architecture: llama. Quantization: 2-bit.
