Create README.md
README.md
ADDED
@@ -0,0 +1,22 @@
1 |
+
GGUF of [ausboss/llama2-13b-supercot-loras2](https://huggingface.co/ausboss/llama2-13b-supercot-loras2) merged into the base Llama 2 13B model. It is currently only quantised to Q5_K_M, as this is the smallest size with accuracy comparable to 8-bit (almost lossless). I have an fp16 GGUF and will probably quantise to 8-bit and 4-bit GGUF soon.
|
2 |
+
|
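For readers unfamiliar with what "merged" means here: a LoRA adapter stores a low-rank update that can be folded into the base weights, so the result loads as a single model with no adapter attached. The sketch below shows only the arithmetic, `W + (alpha / r) * (B @ A)`, on a toy matrix. The shapes, values, and helper names (`matmul`, `merge_lora`) are made up for illustration, and this is not necessarily the exact procedure used to produce this GGUF.

```python
# Toy illustration of folding a LoRA update into a base weight matrix.
# All shapes and values are invented for the example.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A) without modifying the inputs."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) has the same shape as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]

# Base weight W: 2 outputs x 3 inputs.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
# Rank-1 adapter: A is (r x in), B is (out x r).
A = [[1.0, 2.0, 3.0]]
B = [[0.5],
     [-0.5]]

merged = merge_lora(W, A, B, alpha=2, r=1)
print(merged)
```

In practice this step is usually delegated to a library rather than written by hand; the point of the sketch is only that the merged weights have the same shape as the base weights, which is why the merged model can be converted and quantised like any other Llama 2 13B checkpoint.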
3 |
+
Ausboss' original model card with the LoRA training info:
|
4 |
+
|
5 |
+
### Training procedure
|
6 |
+
|
7 |
+
The following bitsandbytes quantization config was used during training:
|
8 |
+
|
9 |
+
quant_method: bitsandbytes
|
10 |
+
load_in_8bit: False
|
11 |
+
load_in_4bit: True
|
12 |
+
llm_int8_threshold: 6.0
|
13 |
+
llm_int8_skip_modules: None
|
14 |
+
llm_int8_enable_fp32_cpu_offload: False
|
15 |
+
llm_int8_has_fp16_weight: False
|
16 |
+
bnb_4bit_quant_type: nf4
|
17 |
+
bnb_4bit_use_double_quant: True
|
18 |
+
bnb_4bit_compute_dtype: bfloat16
|
19 |
+
|
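The settings listed above can be turned into a typed dictionary, which is handy when trying to reproduce the training setup. This is a minimal, dependency-free sketch: `CARD_CONFIG`, `parse_value`, and `parse_card_config` are hypothetical names introduced here, not part of any library. The key names appear to match the keyword arguments of `transformers.BitsAndBytesConfig`, but the sketch stops at a plain `dict` rather than assuming a particular library version.

```python
# The key/value pairs exactly as listed in the model card.
CARD_CONFIG = """\
quant_method: bitsandbytes
load_in_8bit: False
load_in_4bit: True
llm_int8_threshold: 6.0
llm_int8_skip_modules: None
llm_int8_enable_fp32_cpu_offload: False
llm_int8_has_fp16_weight: False
bnb_4bit_quant_type: nf4
bnb_4bit_use_double_quant: True
bnb_4bit_compute_dtype: bfloat16
"""

def parse_value(text):
    """Convert a card value to bool, None, float, or leave it as a string."""
    if text in ("True", "False"):
        return text == "True"
    if text == "None":
        return None
    try:
        return float(text)
    except ValueError:
        return text

def parse_card_config(block):
    """Parse 'key: value' lines into a dict, skipping blank lines."""
    config = {}
    for line in block.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(":")
        config[key.strip()] = parse_value(value.strip())
    return config

config = parse_card_config(CARD_CONFIG)
# quant_method names the backend rather than a tunable option, so split it off.
quant_method = config.pop("quant_method")
print(quant_method)
print(config)
```

Read this way, the card describes a QLoRA-style setup: the base model was loaded in 4-bit NF4 with double quantisation and bfloat16 compute, while the 8-bit options were left at inactive values.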
20 |
+
### Framework versions
|
21 |
+
|
22 |
+
PEFT 0.6.0.dev0
|