---
language:
- en
pipeline_tag: text-generation
tags:
- not-for-all-audiences
---

## Notes

- There is no prompt template; the input is just BOS + plain text. It can also start from an empty prompt.
- Sampling parameters (temperature, repetition penalty, etc.) should be left at their defaults.
- It will not go lewd immediately; it will try to build a coherent story first.
- Generate 1-3 paragraphs at a time; it loses coherence if you try to fill the whole context in a single generation. A minimal usage sketch is included at the end of this card.

File sizes and VRAM figures in the tables below are in GB; FA = flash attention.

## LLaMA-3-8B base

RoPE-scaled to 16k context.

| Name                                   | Quant  | Size (GB) | VRAM with FA (GB) | VRAM without FA (GB) |
|----------------------------------------|--------|-----------|-------------------|----------------------|
| llama-3-8b-lewd-stories-v6-16k.F16     | F16    | 14.9      | 16.6              | 17.4                 |
| llama-3-8b-lewd-stories-v6-16k.Q8_0    | Q8_0   | 8.0       | 10.1              | 10.5                 |
| llama-3-8b-lewd-stories-v6-16k.Q6_K    | Q6_K   | 6.1       | 8.4               | 9.2                  |
| llama-3-8b-lewd-stories-v6-16k.Q5_K_M  | Q5_K_M | 5.3       | 7.6               | 8.1                  |
| llama-3-8b-lewd-stories-v6-16k.Q4_K_M  | Q4_K_M | 4.6       | 6.9               | 7.8                  |

## Yi-1.5-9B-32K

Native 32k context.

| Name                       | Quant  | Size (GB) | VRAM with FA (GB) | VRAM without FA (GB) |
|----------------------------|--------|-----------|-------------------|----------------------|
| yi-lewd-stories-32k.F16    | F16    | 16.4      |                   |                      |
| yi-lewd-stories-32k.Q8_0   | Q8_0   | 8.7       |                   |                      |
| yi-lewd-stories-32k.Q6_K   | Q6_K   | 6.7       |                   |                      |
| yi-lewd-stories-32k.Q5_K_M | Q5_K_M | 5.8       |                   |                      |
| yi-lewd-stories-32k.Q4_K_M | Q4_K_M | 5.0       |                   |                      |

## Mistral-7B-v0.3

Native 32k context.

| Name                            | Quant  | Size (GB) | VRAM with FA (GB) | VRAM without FA (GB) |
|---------------------------------|--------|-----------|-------------------|----------------------|
| mistral-lewd-stories-32k.F16    | F16    | 13.5      |                   |                      |
| mistral-lewd-stories-32k.Q8_0   | Q8_0   | 7.2       |                   |                      |
| mistral-lewd-stories-32k.Q6_K   | Q6_K   | 5.5      |                   |                      |
| mistral-lewd-stories-32k.Q5_K_M | Q5_K_M | 4.8       |                   |                      |
| mistral-lewd-stories-32k.Q4_K_M | Q4_K_M | 4.0       |                   |                      |
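
## Usage sketch

A minimal sketch of running one of the GGUF quants above with llama-cpp-python. The tooling choice, file path, context size, and offload settings are assumptions for illustration, not requirements from this card; any llama.cpp-compatible runner should behave similarly.

```python
# Sketch only: assumes llama-cpp-python is installed with GPU / flash-attention
# support and that the GGUF file named below has been downloaded locally.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-3-8b-lewd-stories-v6-16k.Q5_K_M.gguf",  # hypothetical local path
    n_ctx=16384,       # the LLaMA-3 build is RoPE-scaled to 16k context
    flash_attn=True,   # corresponds to the "VRAM with FA" column above
    n_gpu_layers=-1,   # offload all layers if VRAM allows
)

# No prompt template: plain text only (an empty prompt also works).
prompt = "The rain had not let up for three days when"

# Leave sampling at the library defaults, as the notes recommend, and
# generate a paragraph or three at a time rather than the full context.
out = llm(prompt, max_tokens=300)
print(out["choices"][0]["text"])
```

Generating in shorter chunks and appending each completion back onto the prompt keeps the story coherent while staying within the 16k (or 32k) context window.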