backyardai/LemonadeRP-4.5.3-GGUF


PacmanIncarnate committed on Mar 13

Commit 772fba3 (1 parent: 669e242)

Update README.md

Files changed (1)

README.md +25 -5


@@ -1,8 +1,10 @@
-<img src="Faraday Model Repository Banner.png" alt="Faraday.dev" style="width: 20%; min-width: 32px; display: block; horizontal align: left;">
-# Faraday.dev Model Repository
-Conveniently download this model from the Faraday.dev app model manager.
-- [Download Faraday here to get started.](https://faraday.dev/)
-- Request Additional GGUF models at [r/LLM_Quants](https://www.reddit.com/r/LLM_Quants/s/iizaX3acGa)
 ***
@@ -20,6 +22,24 @@ GGUF is a large language model (LLM) format that can be split between CPU and GP
 GGUF models are quantized to reduce resource usage, with a tradeoff of reduced coherence at lower quantizations. Quantization reduces the precision of the model weights by changing the number of bits used for each weight.
 ***
 ## Faraday.dev

+<img src="Faraday Model Repository Banner.png" alt="Faraday.dev" style="height: 150px; min-width: 32px; display: block; margin: auto;">
+**<p style="text-align: center;">The official library of GGUF format models for use in the local AI chat app, Faraday.dev.</p>**
+<p style="text-align: center;"><a href="https://faraday.dev/">Download Faraday here to get started.</a></p>
+<p style="text-align: center;"><a href="https://www.reddit.com/r/LLM_Quants/">Request Additional models at r/LLM_Quants.</a></p>
 ***
 GGUF models are quantized to reduce resource usage, with a tradeoff of reduced coherence at lower quantizations. Quantization reduces the precision of the model weights by changing the number of bits used for each weight.
+### 7B Quantization Chart
+*Memory required must be less than your available RAM.*
+| Quant method | Size | Memory required at 4K Context |
+| --- | --- | --- |
+| Q2_K | 2.72 GB| 5.22 GB |
+| Q3_K_S | 3.16 GB| 5.66 GB |
+| Q3_K_M | 3.52 GB| 6.02 GB |
+| Q3_K_L | 3.82 GB| 6.32 GB |
+| Q4_0 | 4.11 GB| 6.61 GB |
+| Q4_K_S | 4.14 GB| 6.64 GB |
+| Q4_K_M | 4.37 GB| 6.87 GB |
+| Q5_0 | 5.00 GB| 7.50 GB |
+| Q5_K_S | 5.00 GB| 7.50 GB |
+| Q5_K_M | 5.13 GB| 7.63 GB |
+| Q6_K | 5.94 GB| 8.44 GB |
+| Q8_0 | 7.70 GB| 10.20 GB |
 ***
 ## Faraday.dev
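
The 7B Quantization Chart added in this commit can be turned into a quick lookup for choosing a quant. In every row of the chart, the "Memory required at 4K Context" value equals the file size plus 2.50 GB, so the sketch below treats that as a fixed overhead. That reading is an inference from the numbers, not something the README states, and the helper names (`memory_required_gb`, `largest_quant_that_fits`) are hypothetical.

```python
from typing import Optional

# File sizes in GB, copied from the 7B Quantization Chart in the README diff.
QUANT_SIZES_GB = {
    "Q2_K": 2.72,
    "Q3_K_S": 3.16,
    "Q3_K_M": 3.52,
    "Q3_K_L": 3.82,
    "Q4_0": 4.11,
    "Q4_K_S": 4.14,
    "Q4_K_M": 4.37,
    "Q5_0": 5.00,
    "Q5_K_S": 5.00,
    "Q5_K_M": 5.13,
    "Q6_K": 5.94,
    "Q8_0": 7.70,
}

# Assumption: the chart's memory column is always size + 2.50 GB at 4K context.
CONTEXT_OVERHEAD_GB = 2.50


def memory_required_gb(quant: str) -> float:
    """Return the chart's 'memory required at 4K context' figure for a quant."""
    return round(QUANT_SIZES_GB[quant] + CONTEXT_OVERHEAD_GB, 2)


def largest_quant_that_fits(available_ram_gb: float) -> Optional[str]:
    """Pick the largest quant whose memory requirement is strictly below
    the available RAM, following the chart's 'must be less than' note.
    Returns None when nothing fits."""
    fitting = [
        q for q in QUANT_SIZES_GB if memory_required_gb(q) < available_ram_gb
    ]
    if not fitting:
        return None
    return max(fitting, key=lambda q: QUANT_SIZES_GB[q])


if __name__ == "__main__":
    for ram in (6.0, 8.0, 16.0):
        print(ram, largest_quant_that_fits(ram))
```

Larger files generally correspond to higher-precision quants, so picking the largest one that fits follows the README's note that coherence drops at lower quantizations. Available RAM on a real machine is usually less than installed RAM, so the input should reflect what is actually free.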