---
language:
- en
pipeline_tag: text-generation
tags:
- not-for-all-audiences
---
|
## Notes

There is no prompt template; the input is just the BOS token followed by plain text.

It can also start from an empty prompt.

Temperature, repetition penalty, and other sampling parameters should be left at their defaults.

It will not turn lewd immediately; it first tries to build a coherent story.

It's best to generate 1-3 paragraphs at a time; the model loses coherence if asked to fill the full context in a single pass.
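The chunked approach above can be sketched as a simple loop: request a short continuation, append it to the story so far, and repeat. The `generate` function here is a hypothetical placeholder standing in for a real model call (e.g. via llama-cpp-python); only the looping pattern is the point.

```python
def generate(prompt: str, max_tokens: int = 300) -> str:
    # Placeholder for a real model call, e.g. with llama-cpp-python:
    #   llm = llama_cpp.Llama(model_path="...", n_ctx=16384)
    #   return llm(prompt, max_tokens=max_tokens)["choices"][0]["text"]
    return " ...next paragraph..."

def write_story(seed: str = "", steps: int = 3) -> str:
    # The seed may be empty -- the model can start from nothing.
    story = seed
    for _ in range(steps):
        # Ask for only a few paragraphs per call and append the result,
        # rather than generating the full context in one pass.
        story += generate(story)
    return story

print(write_story("It was a quiet evening."))
```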
|
|
|
|
|
|
|
## LLaMA-3-8B base

RoPE-scaled to a 16k context. FA = flash attention.
|
|
|
| Name | Quant | Size (GB) | VRAM, with FA (GB) | VRAM, no FA (GB) |
|----------------------------------------|--------|-----------|--------------------|------------------|
| llama-3-8b-lewd-stories-v6-16k.F16 | F16 | 14.9 | 16.6 | 17.4 |
| llama-3-8b-lewd-stories-v6-16k.Q8_0 | Q8_0 | 8.0 | 10.1 | 10.5 |
| llama-3-8b-lewd-stories-v6-16k.Q6_K | Q6_K | 6.1 | 8.4 | 9.2 |
| llama-3-8b-lewd-stories-v6-16k.Q5_K_M | Q5_K_M | 5.3 | 7.6 | 8.1 |
| llama-3-8b-lewd-stories-v6-16k.Q4_K_M | Q4_K_M | 4.6 | 6.9 | 7.8 |
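As a concrete example, running one of the quants above with llama.cpp might look like the following. This is a sketch, not a verified command line: the binary name and exact flags depend on your llama.cpp build, and the filename is taken from the table above (add the `.gguf` extension as appropriate).

```shell
# Raw-text completion: no chat template, BOS is added by the loader.
# -c 16384 matches the RoPE-scaled context; -fa enables flash attention.
# Sampling flags are omitted on purpose -- defaults are recommended.
llama-cli \
  -m llama-3-8b-lewd-stories-v6-16k.Q4_K_M.gguf \
  -c 16384 \
  -fa \
  -n 400 \
  -p "It was a quiet evening."
```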
|
|
|
## Yi-1.5-9B-32K

Native 32k context.
|
|
|
| Name | Quant | Size (GB) | VRAM, with FA (GB) | VRAM, no FA (GB) |
|----------------------------|--------|-----------|--------------------|------------------|
| yi-lewd-stories-32k.F16 | F16 | 16.4 | | |
| yi-lewd-stories-32k.Q8_0 | Q8_0 | 8.7 | | |
| yi-lewd-stories-32k.Q6_K | Q6_K | 6.7 | | |
| yi-lewd-stories-32k.Q5_K_M | Q5_K_M | 5.8 | | |
| yi-lewd-stories-32k.Q4_K_M | Q4_K_M | 5.0 | | |
|
|
|
|
|
|
|
## Mistral-7B-v0.3

Native 32k context.
|
|
|
| Name | Quant | Size (GB) | VRAM, with FA (GB) | VRAM, no FA (GB) |
|---------------------------------|--------|-----------|--------------------|------------------|
| mistral-lewd-stories-32k.F16 | F16 | 13.5 | | |
| mistral-lewd-stories-32k.Q8_0 | Q8_0 | 7.2 | | |
| mistral-lewd-stories-32k.Q6_K | Q6_K | 5.5 | | |
| mistral-lewd-stories-32k.Q5_K_M | Q5_K_M | 4.8 | | |
| mistral-lewd-stories-32k.Q4_K_M | Q4_K_M | 4.0 | | |