---
language:
- en
pipeline_tag: text-generation
tags:
- not-for-all-audiences
---
|
## Notes

There is no prompt template; the input is just the BOS token followed by plain text.

It can also start from an empty prompt.

Temperature, repetition penalty, and other sampling parameters should be left at their defaults.

It will not turn lewd immediately; it first tries to build a coherent story.

It's best to generate 1-3 paragraphs at a time; the model loses coherence if asked to fill the full context in a single pass.
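The chunked approach above can be sketched as a simple loop: request a short continuation, append it to the story so far, and repeat. The `generate` function here is a hypothetical placeholder standing in for a real model call (e.g. via llama-cpp-python); only the looping pattern is the point.

```python
def generate(prompt: str, max_tokens: int = 300) -> str:
    # Placeholder for a real model call, e.g. with llama-cpp-python:
    #   llm = llama_cpp.Llama(model_path="...", n_ctx=16384)
    #   return llm(prompt, max_tokens=max_tokens)["choices"][0]["text"]
    return " ...next paragraph..."

def write_story(seed: str = "", steps: int = 3) -> str:
    # The seed may be empty -- the model can start from nothing.
    story = seed
    for _ in range(steps):
        # Ask for only a few paragraphs per call and append the result,
        # rather than generating the full context in one pass.
        story += generate(story)
    return story

print(write_story("It was a quiet evening."))
```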
|
|
|
|
|
|
|
## LLaMA-3-8B base

RoPE-scaled to a 16k context. FA = flash attention.
|
|
|
| Name | Quant | Size (GB) | VRAM, with FA (GB) | VRAM, no FA (GB) |
|----------------------------------------|--------|-----------|--------------------|------------------|
| llama-3-8b-lewd-stories-v6-16k.F16 | F16 | 14.9 | 16.6 | 17.4 |
| llama-3-8b-lewd-stories-v6-16k.Q8_0 | Q8_0 | 8.0 | 10.1 | 10.5 |
| llama-3-8b-lewd-stories-v6-16k.Q6_K | Q6_K | 6.1 | 8.4 | 9.2 |
| llama-3-8b-lewd-stories-v6-16k.Q5_K_M | Q5_K_M | 5.3 | 7.6 | 8.1 |
| llama-3-8b-lewd-stories-v6-16k.Q4_K_M | Q4_K_M | 4.6 | 6.9 | 7.8 |
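As a concrete example, running one of the quants above with llama.cpp might look like the following. This is a sketch, not a verified command line: the binary name and exact flags depend on your llama.cpp build, and the filename is taken from the table above (add the `.gguf` extension as appropriate).

```shell
# Raw-text completion: no chat template, BOS is added by the loader.
# -c 16384 matches the RoPE-scaled context; -fa enables flash attention.
# Sampling flags are omitted on purpose -- defaults are recommended.
llama-cli \
  -m llama-3-8b-lewd-stories-v6-16k.Q4_K_M.gguf \
  -c 16384 \
  -fa \
  -n 400 \
  -p "It was a quiet evening."
```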
|
|
|
## Yi-1.5-9B-32K

Native 32k context.
|
|
|
| Name | Quant | Size (GB) | VRAM, with FA (GB) | VRAM, no FA (GB) |
|----------------------------|--------|-----------|--------------------|------------------|
| yi-lewd-stories-32k.F16 | F16 | 16.4 | | |
| yi-lewd-stories-32k.Q8_0 | Q8_0 | 8.7 | | |
| yi-lewd-stories-32k.Q6_K | Q6_K | 6.7 | | |
| yi-lewd-stories-32k.Q5_K_M | Q5_K_M | 5.8 | | |
| yi-lewd-stories-32k.Q4_K_M | Q4_K_M | 5.0 | | |
|
|
|
|
|
|
|
## Mistral-7B-v0.3

Native 32k context.
|
|
|
| Name | Quant | Size (GB) | VRAM, with FA (GB) | VRAM, no FA (GB) |
|---------------------------------|--------|-----------|--------------------|------------------|
| mistral-lewd-stories-32k.F16 | F16 | 13.5 | | |
| mistral-lewd-stories-32k.Q8_0 | Q8_0 | 7.2 | | |
| mistral-lewd-stories-32k.Q6_K | Q6_K | 5.5 | | |
| mistral-lewd-stories-32k.Q5_K_M | Q5_K_M | 4.8 | | |
| mistral-lewd-stories-32k.Q4_K_M | Q4_K_M | 4.0 | | |