Can you train / modify an LLM with 6 GB of VRAM?

#2
by dadadies - opened

I have 6 GB of VRAM (RTX 4050; I couldn't afford a 4060 with 8 GB). I don't plan to do anything with it, but if that's true, it's good to know. Also, what's the difference between this and the original FuseChat, besides the GGUF format? I know practically nothing about LLMs. I bought my GPU a week or so ago and am trying out Llama 3 as my first local LLM.

@dadadies

This isn't trained on anything.

What I did was create an Imatrix with the Capybara dataset.

An Imatrix (importance matrix) is computed from a calibration dataset and is used to reduce information loss when quantizing.

Training on 6 GB is not feasible for anything bigger than TinyLlama.
You might be able to train a 4B-parameter model if you use Unsloth and 4-bit QLoRA.
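To give a rough sense of why 4-bit matters on a 6 GB card, here is a back-of-the-envelope sketch that estimates only the memory needed to hold the model weights. The function name `weight_memory_gb` is made up for illustration, and the estimate deliberately ignores LoRA adapters, optimizer state, activations, and quantization overhead, so real usage will be higher.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Illustrative helper, not part of any library. It ignores activations,
    optimizer state, adapters, and quantization metadata.
    """
    bytes_per_weight = bits_per_weight / 8
    return params_billions * bytes_per_weight


if __name__ == "__main__":
    # (label, parameters in billions, bits per weight)
    examples = [
        ("TinyLlama 1.1B, 16-bit", 1.1, 16),
        ("4B model, 4-bit", 4.0, 4),
        ("8B model, 4-bit", 8.0, 4),
    ]
    for label, params, bits in examples:
        print(f"{label}: ~{weight_memory_gb(params, bits):.1f} GB for weights")
```

By this estimate a 4B model in 4-bit needs about 2 GB for weights, which leaves some headroom on 6 GB for everything else, while an 8B model in 4-bit already takes about 4 GB before training overhead.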

Since training on 6 GB is slow even if you manage to get it working, use the Google Colab notebook made by Unsloth instead:
https://huggingface.co/unsloth/llama-3-8b-bnb-4bit

You can "modify" LLMs by merging instead.

Just find two models you like and use mergekit.
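As a minimal sketch of what that looks like, the snippet below builds a mergekit YAML config for a simple linear (weighted-average) merge of two models. The helper name `linear_merge_config` and the model IDs `your-org/model-a` and `your-org/model-b` are placeholders invented for this example, and linear is just one of several merge methods; check the mergekit README for the current options.

```python
def linear_merge_config(model_a: str, model_b: str, weight_a: float = 0.5) -> str:
    """Return a mergekit YAML config for a linear merge of two models.

    Illustrative helper, not part of mergekit. The two models are assumed
    to share the same architecture.
    """
    weight_b = 1.0 - weight_a
    return (
        "models:\n"
        f"  - model: {model_a}\n"
        "    parameters:\n"
        f"      weight: {weight_a}\n"
        f"  - model: {model_b}\n"
        "    parameters:\n"
        f"      weight: {weight_b}\n"
        "merge_method: linear\n"
        "dtype: float16\n"
    )


if __name__ == "__main__":
    # Placeholder model IDs: replace them with the two models you want to merge.
    print(linear_merge_config("your-org/model-a", "your-org/model-b"))
```

Save the printed text to a file such as `merge-config.yml`, then run `mergekit-yaml merge-config.yml ./merged-model` to produce the merged model.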

It is a shame not being able to train, but you can still get pretty interesting models by merging.

Thanks for the info. Then I might try merging one of these days.

I'm closing this.

You can open a new discussion on my latest merge if you need help with mergekit.

Virt-io changed discussion status to closed
