---
base_model:
- unsloth/Qwen2.5-7B-bnb-4bit
- unsloth/gemma-2-9b-it-bnb-4bit
- unsloth/Llama-3.2-3B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- gemma2
- llama3
- trl
license: apache-2.0
language:
- en
datasets:
- Sweaterdog/MindCraft-LLM-tuning
---
# Uploaded model
- **Developed by:** Sweaterdog
- **License:** apache-2.0
- **Finetuned from model :** unsloth/Qwen2.5-7B-bnb-4bit and unsloth/gemma-2-9b-it-bnb-4bit and unsloth/Llama-3.2-3B-Instruct
The MindCraft LLM tuning CSV file can be found here and can be tweaked as needed: [MindCraft-LLM](https://huggingface.co/datasets/Sweaterdog/MindCraft-LLM-tuning/raw/main/Gemini-Minecraft%20-%20training_data_minecraft_updated.csv)
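
If you plan to tweak the dataset, it may help to check its shape first. The following is a minimal sketch, assuming you have already saved the CSV locally; the function name `summarize_csv` and the file name in the usage note are hypothetical, and the sketch makes no assumptions about the column names.

```python
import csv
from pathlib import Path


def summarize_csv(path: Path) -> tuple[list[str], int]:
    """Return the header row and the number of data rows in a CSV file.

    The csv module is used instead of splitting on newlines because
    training examples may contain quoted commas or line breaks.
    """
    with path.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)
    return header, row_count
```

For example, `summarize_csv(Path("training_data.csv"))` returns the column names and the row count, which is a quick sanity check after editing the file.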
# What is the Purpose?
This model is built and designed to play Minecraft via the extension named "[MindCraft](https://github.com/kolbytn/mindcraft)", which allows language models, like the ones provided in the files section, to play Minecraft.
- Why a new model?
#
While models that aren't fine-tuned to play Minecraft *can* play Minecraft, most are slow, inaccurate, and not as smart. Fine-tuning expands the model's reasoning, conversation examples, and command (tool) usage.
- What kind of Dataset was used?
#
I'm deeming this model *"Hermes"*. It was trained for reasoning using examples of in-game "Vision" as well as examples of spatial reasoning. To expand its thinking, I also added puzzle examples where the model broke down the process step by step to reach the goal.
- Why choose Qwen2.5 for the base model?
#
During testing to find the best local LLM for playing Minecraft, I came across two: Gemma 2 and Qwen2.5. These two were by far the best at playing Minecraft before fine-tuning, and I knew that, once tuned, they would become better.
- If Gemma 2 and Qwen2.5 are the best before fine-tuning, why include Llama 3.2, especially the lower-intelligence, 3B-parameter version?
#
That is a great question. Since Llama 3.2 3B has a low parameter count, it is dumb and doesn't play Minecraft well without fine-tuning. But it is a lot smaller than the other models, which makes it a better fit for people with less powerful computers, and the hope is that, once tuned, it will become much better at Minecraft.
# How to Use
To use this model, download the GGUF file of the version you want, either a Qwen or Gemma model, and then the Modelfile. After you download both, change the model path in the Modelfile to point to your model. Here is a simple guide for the rest, if needed:
#
1. Download the .gguf model you want. For this example, it is in the standard Windows "Downloads" folder.
2. Download the Modelfile.
3. Open the Modelfile in Notepad (you can rename it to Modelfile.txt first) and change the GGUF path. For example, this is my path: "C:\Users\SweaterDog\OneDrive\Documents\Raw GGUF Files\Hermes-1.0\Hermes-1.Q8_0.gguf"
4. Save and close the Modelfile.
5. Rename "Modelfile.txt" back to "Modelfile" if you renamed it beforehand.
6. Open CMD and type "ollama create Hermes1 -f Modelfile" (you can change the name to anything you'd like; for this example, I am just using the same name as the GGUF).
7. Wait until it finishes.
8. In the CMD window, type "ollama run Hermes1" (replace the 1 in Hermes1 with whatever version you downloaded).
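
If you would rather not edit the Modelfile by hand, the path change in step 3 can be scripted. The following is a minimal sketch, assuming the Modelfile selects the weights with a `FROM` line, as Ollama Modelfiles do; the function names `point_modelfile_at_gguf` and `ollama_commands` are hypothetical helpers and not part of this repository.

```python
from pathlib import Path


def point_modelfile_at_gguf(modelfile: Path, gguf: Path) -> str:
    """Rewrite the first FROM line of a Modelfile so it points at `gguf`.

    Returns the new Modelfile text after writing it back to disk.
    """
    lines = modelfile.read_text(encoding="utf-8").splitlines()
    for i, line in enumerate(lines):
        # Modelfile instructions are not case sensitive, so match FROM loosely.
        if line.strip().upper().startswith("FROM "):
            lines[i] = f"FROM {gguf}"
            break
    else:
        # No FROM line was found: add one at the top.
        lines.insert(0, f"FROM {gguf}")
    new_text = "\n".join(lines) + "\n"
    modelfile.write_text(new_text, encoding="utf-8")
    return new_text


def ollama_commands(model_name: str, modelfile: Path) -> list[list[str]]:
    """Build the two commands from steps 6 and 8 as argument lists."""
    return [
        ["ollama", "create", model_name, "-f", str(modelfile)],
        ["ollama", "run", model_name],
    ]
```

Each list returned by `ollama_commands("Hermes1", Path("Modelfile"))` can be passed to `subprocess.run` if you want the script to run the commands for you instead of typing them into CMD.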
#
I'm aware the page says there are multiple Qwen2.5 files even though there are two, and that it lists Gemma2 models even though there aren't any. I have been trying to train the rest of these models.
#
For anybody wondering what the context length is: the Hermes v1 models have a context window of 8196 tokens, but when the v2 generation drops, including Llama 3.2 and Gemma2, they will use a larger dataset and have a context length of 128000 tokens.
#
These Qwen2 and Gemma2 models were trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)