salzubi401 committed (verified) · commit c19e18a · parent a5b76eb · Create README.md
---
language:
- en
license: llama3.1
library_name: transformers
tags:
- Llama-3.1
- Instruct
- loyal AI
- GGUF
- finetune
- chat
- gpt4
- synthetic data
- roleplaying
- unhinged
- funny
- opinionated
- assistant
- companion
- friend
base_model: meta-llama/Llama-3.1-8B-Instruct
---

# Dobby-Mini-Unhinged-Llama-3.1-8B_GGUF

Dobby-Mini-Unhinged is a compact, high-performance GGUF model based on Llama 3.1 with 8 billion parameters. Designed for efficiency, it ships in **4-bit**, **6-bit**, and **8-bit** quantizations, offering the flexibility to run on a wide range of hardware configurations with minimal loss in quality.
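
As a rough illustration of why quantization matters at this scale, the weight footprint at each level can be estimated from the parameter count (a back-of-the-envelope sketch, not official file sizes; real GGUF files are somewhat larger due to metadata and mixed-precision tensors):

```python
def estimated_weight_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Rough weight size in gigabytes: parameters * bits / 8.

    Ignores tokenizer data, metadata, and per-tensor overhead, so
    actual GGUF files will be somewhat larger.
    """
    return n_params * bits_per_weight / 8 / 1e9

# 8 billion parameters at the three supported quantization levels.
for bits in (4, 6, 8):
    print(f"{bits}-bit: ~{estimated_weight_size_gb(8e9, bits):.0f} GB")
```

This is why 4-bit fits on lightweight devices (~4 GB of weights) while 8-bit (~8 GB) calls for a more capable GPU.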

## Compatibility

This model is compatible with:

- **[LMStudio](https://lmstudio.ai/)**: An easy-to-use desktop application for running large language models locally.
- **[OLLAMA](https://ollama.com/)**: A versatile tool for deploying, managing, and interacting with large language models.

## Quantization Levels

| **Quantization** | **Description** | **Use Case** |
|------------------|-----------------|--------------|
| **4-bit** | Highly compressed for minimal memory usage; some loss in precision and quality, but great for lightweight devices with limited VRAM. | Ideal for testing, quick prototyping, or running on low-end GPUs and CPUs. |
| **6-bit** | Strikes a balance between compression and quality; improved accuracy over 4-bit without requiring significantly more memory. | Recommended for users with mid-range hardware seeking a compromise between speed and precision. |
| **8-bit** | Highest-quality quantization, staying close to the original FP16 weights while using far less memory than FP16 or FP32. | Ideal for high-performance systems where accuracy and precision are critical. |

44
+ ## Recommended Usage
45
+
46
+ Choose your quantization level based on the hardware you are using:
47
+ - **4-bit** for ultra-lightweight systems.
48
+ - **6-bit** for balance on mid-tier hardware.
49
+ - **8-bit** for maximum performance on powerful GPUs.
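
The hardware guidance above can be sketched as a small helper. The memory thresholds here are illustrative assumptions (leaving headroom above the rough 4/6/8 GB weight sizes for KV cache and runtime overhead), not official requirements:

```python
def pick_quantization(available_memory_gb: float) -> str:
    """Map available (V)RAM to a suggested quantization level.

    Thresholds are illustrative: they leave headroom above the
    approximate weight sizes for KV cache and runtime overhead.
    """
    if available_memory_gb >= 12:
        return "8-bit"  # powerful GPUs: maximum quality
    if available_memory_gb >= 8:
        return "6-bit"  # mid-tier hardware: balanced
    return "4-bit"      # ultra-lightweight systems

print(pick_quantization(16))
print(pick_quantization(6))
```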

This model responds well to prompt engineering for domain-specific tasks, making it an excellent choice for interactive applications such as chatbots, question answering, and creative writing.
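
For interactive use, prompts for Llama 3.1-based models follow the Llama 3.x chat template; the sketch below shows the general shape, but the template embedded in the GGUF metadata is authoritative, and runtimes like LMStudio and OLLAMA normally apply it automatically:

```python
def format_llama3_chat(system: str, user: str) -> str:
    """Build a single-turn prompt in the Llama 3.x chat format.

    Illustrative sketch only; most runtimes read the exact template
    from the GGUF metadata and apply it for you.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_chat(
    "You are a blunt, opinionated assistant.",
    "Tell me a joke.",
)
```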