Peeepy/SuperCOT-L2-13B-GGUF


Peeepy committed on Sep 14, 2023

Commit 2e271ae · 1 Parent(s): acdd45e

Create README.md

Files changed (1)

README.md +22 -0

	@@ -0,0 +1,22 @@

+GGUF conversion of [ausboss/llama2-13b-supercot-loras2](https://huggingface.co/ausboss/llama2-13b-supercot-loras2) merged into the base Llama 2 13B model. It is currently quantised only to Q5_K_M, as this is the smallest size with accuracy comparable to 8-bit (almost lossless). I have an fp16 GGUF and will probably quantise to 8-bit and 4-bit GGUF soon.
+Ausboss' original model card with the LoRA training info:
+### Training procedure
+The following bitsandbytes quantization config was used during training:
+    quant_method: bitsandbytes
+    load_in_8bit: False
+    load_in_4bit: True
+    llm_int8_threshold: 6.0
+    llm_int8_skip_modules: None
+    llm_int8_enable_fp32_cpu_offload: False
+    llm_int8_has_fp16_weight: False
+    bnb_4bit_quant_type: nf4
+    bnb_4bit_use_double_quant: True
+    bnb_4bit_compute_dtype: bfloat16
+### Framework versions
+    PEFT 0.6.0.dev0
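
The quantization settings listed under "Training procedure" map onto the constructor arguments of `BitsAndBytesConfig` in the `transformers` library. The block below is a minimal sketch of that mapping, not the original training script: the names `QUANT_CONFIG` and `build_bnb_config` are hypothetical, and it assumes `torch` and `transformers` are installed when the builder is called. The `quant_method` entry is left out because it is not passed to the constructor here.

```python
# Values copied from the training config in the model card.
# QUANT_CONFIG and build_bnb_config are illustrative names, not part of any library.
QUANT_CONFIG = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",
}


def build_bnb_config(settings=QUANT_CONFIG):
    """Build a transformers BitsAndBytesConfig from a settings dict."""
    # Imported lazily so the dict above can be inspected without
    # the heavy dependencies installed.
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(settings)
    # The model card stores the compute dtype as a string;
    # convert it to the matching torch dtype (torch.bfloat16).
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)
```

Calling `build_bnb_config()` returns a config object that can be passed as `quantization_config` when loading a base model for 4-bit LoRA training. It is not needed to run the GGUF file itself.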