LoneStriker
/

Llama-3-70B-Instruct-Storywriter-6.0bpw-h6-exl2

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Llama-3-70B-Instruct-Storywriter-6.0bpw-h6-exl2 / README.md

LoneStriker's picture

Upload folder using huggingface_hub

b8e71b0 verified 6 months ago

|

1.2 kB

	# Llama 3 70B Instruct Storywriter
	Llama 3 70B Instruct, further finetuned on a dataset consisting of books in the fiction genre.

	This was just an experiment, but it turned out well enough that I'm sharing it. The finetuning has caused a significant shift in the model's writing style, and seems to have made it more creative. There may be a slight decrease in overall intelligence.

	Because this was trained on Instruct, you can use the normal Instruct chat formatting. It may also work well in raw completion mode.

	## Training details
	Trained on 4 4090s using [qlora-pipe](https://github.com/tdrussell/qlora-pipe).
	Dataset consists of about 800 books in the fiction genre, totaling 570 MB of raw text.
	Rank 64 QLoRA trained at 8192 sequence length.
	### Evaluation metrics

	<img src="https://i.imgur.com/sCMjix4.png" width="800" />

	## Why no 8B?
	I tried multiple times to train this on Llama 3 8B Instruct, using a variety of hyperparameters. It never worked well. The model took a huge hit to intelligence every time, to the point of being unusable. 70B fared much better. I don't know why, maybe 8B is just too small for this type of technique, and loses too much of the instruction-tuned smarts.