	---
	base_model: unsloth/Qwen2.5-7B-bnb-4bit
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- qwen2
	- trl
	license: apache-2.0
	language:
	- en
	---

	# Uploaded model

	- Developed by: Sweaterdog
	- License: apache-2.0
	- Finetuned from model : unsloth/Qwen2.5-7B-bnb-4bit

	The MindCraft LLM tuning CSV file can be found here and can be tweaked as needed: [MindCraft-LLM](https://huggingface.co/datasets/Sweaterdog/MindCraft-LLM-tuning/raw/main/Gemini-Minecraft%20-%20training_data_minecraft_updated.csv)
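
As a rough illustration of "tweaking" the CSV, here is a minimal sketch that parses the file, drops rows with empty cells, and writes the result back out. It uses only the Python standard library. The column names in `SAMPLE` (`input`, `output`) are placeholders made up for this example, not the real headers of the dataset; the helpers themselves don't depend on any particular column names.

```python
import csv
import io

# Hypothetical stand-in for the real file; the actual column names may differ.
SAMPLE = "input,output\nhello,Hi there!\nempty row,\n"


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def drop_incomplete(rows):
    """Keep only rows where every column holds a non-empty string."""
    return [
        row for row in rows
        if all(isinstance(value, str) and value.strip() for value in row.values())
    ]


def dump_rows(rows):
    """Serialize rows back to CSV text, reusing the first row's columns."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


cleaned = drop_incomplete(load_rows(SAMPLE))
print(dump_rows(cleaned))
```

To run this on the real data, download the CSV from the link above, read it into a string, and pass that string to `load_rows` instead of `SAMPLE`.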

	# What is the Purpose?

	This model is designed to play Minecraft through [MindCraft](https://github.com/kolbytn/mindcraft), an extension that allows language models, like the ones provided in the files section, to play Minecraft.
	- Why a new model?
	#
	Models that aren't fine-tuned for Minecraft can still play it, but most are slow, inaccurate, and not as smart. Fine-tuning expands the model's reasoning, conversation examples, and command (tool) usage.
	- What kind of Dataset was used?
	#
	I'm calling this model "Hermes". It was trained for reasoning using examples of in-game "Vision" as well as examples of spatial reasoning. To expand its thinking, I also added puzzle examples where the model breaks the process down step by step to reach the goal.
	- Why choose Qwen2.5 for the base model?
	#
	While testing to find the best local LLM for playing Minecraft, I came across two: Gemma 2 and Qwen2.5. These two were by far the best at playing Minecraft before fine-tuning, and I knew they would get better once tuned.


	Here is the link to the Google Colab notebook for fine-tuning your own model, in case you want to use a different base model, such as Llama-3-8b, or change the hyperparameters:
	[Google Colab](https://colab.research.google.com/drive/1ZoP7vO50kQrtHoQ54EI6URnoWzIJUg-c?usp=sharing)
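
For readers who want a feel for what swapping the base model or hyperparameters might look like outside the notebook, here is a minimal sketch. It is not taken from the Colab notebook. The `TuneConfig` class, its default values other than the base model, the `target_modules` list, and the Llama repo id `unsloth/llama-3-8b-bnb-4bit` are all assumptions for illustration. `load_model` uses Unsloth's `FastLanguageModel` and needs a GPU environment with `unsloth` installed, so it is defined here but not called.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TuneConfig:
    """Hypothetical bundle of settings; not the notebook's actual values."""
    base_model: str = "unsloth/Qwen2.5-7B-bnb-4bit"  # base model named in this card
    max_seq_length: int = 2048  # assumed value
    lora_rank: int = 16         # assumed value
    lora_alpha: int = 16        # assumed value


def load_model(cfg):
    """Load a 4-bit base model and attach LoRA adapters (requires unsloth + GPU)."""
    # Imported lazily so the config can be inspected without unsloth installed.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=cfg.base_model,
        max_seq_length=cfg.max_seq_length,
        load_in_4bit=True,
    )
    model = FastLanguageModel.get_peft_model(
        model,
        r=cfg.lora_rank,
        lora_alpha=cfg.lora_alpha,
        # Assumed choice of attention projections to adapt.
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )
    return model, tokenizer


# Switch the base model and one hyperparameter while keeping everything else.
llama_cfg = replace(
    TuneConfig(),
    base_model="unsloth/llama-3-8b-bnb-4bit",  # assumed repo id
    lora_rank=32,
)
print(llama_cfg)
```

Keeping the settings in one small object makes it easy to compare runs: only the fields passed to `replace` change, and the rest stay at their defaults.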

	#
	This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

	[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)