|
--- |
|
license: llama3 |
|
library_name: transformers |
|
tags: |
|
- not-for-all-audiences |
|
- llama-cpp |
|
- gguf-my-repo |
|
base_model: Hastagaras/Jamet-8B-L3-MK.V-Blackroot |
|
--- |
|
|
|
# Triangle104/Jamet-8B-L3-MK.V-Blackroot-Q4_K_S-GGUF |
|
This model was converted to GGUF format from [`Hastagaras/Jamet-8B-L3-MK.V-Blackroot`](https://huggingface.co/Hastagaras/Jamet-8B-L3-MK.V-Blackroot) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space. |
|
Refer to the [original model card](https://huggingface.co/Hastagaras/Jamet-8B-L3-MK.V-Blackroot) for more details on the model. |
|
|
|
--- |
|
Different base model, different methods (without a Model Stock merge, because it didn't work well for me: it produced repeating sentences or words at a temperature of 0, which also happened with Halu Blackroot). This model is probably more similar to the Anjir model.
|
|
|
This model has been on a long journey; I have around nine variations of it. I tested all of them at Q4_K_M and decided to release this one, variation number 7.
|
|
|
And thanks for all the feedback on the previous model; it helped a lot. (I couldn't fix the issue with German, since I can't speak German; heck, even my English is bad.)
|
|
|
More Details: |
|
|
|
This model builds on a base model that is itself based on the UltimateAnjir model, and it shares the same creative, cheerful, and positive tendencies. I then merged it with Llama 3 Instruct.
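A merge like the one described above is commonly expressed as a mergekit configuration. The sketch below is purely hypothetical: the card does not state the merge method, weights, or the base model's repo ID, so the slerp method, the `t` value, and the `your-name/unstated-base-model` placeholder are all assumptions.

```yaml
# Hypothetical mergekit config; the method, weights, and first model ID
# are guesses for illustration, not the author's actual recipe.
slices:
  - sources:
      - model: your-name/unstated-base-model   # placeholder for the unnamed base
        layer_range: [0, 32]
      - model: meta-llama/Meta-Llama-3-8B-Instruct
        layer_range: [0, 32]
merge_method: slerp
base_model: meta-llama/Meta-Llama-3-8B-Instruct
parameters:
  t: 0.5   # interpolation factor between the two models
dtype: bfloat16
```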
|
Next came DPO, to reduce the cheerfulness, emojis, and positivity. (This is based on the Jamet MK.II feedback regarding positivity.) I trained a QLoRA with Unsloth: I used about 1,000 prompts from Alpaca to generate a dataset, selected the responses containing emojis, removed the emojis with a regex, then placed the emoji-free versions in the chosen column and the original responses with emojis in the rejected column.
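The chosen/rejected split above can be sketched in a few lines of Python. This is a minimal illustration, not the author's actual script: the emoji character ranges in the regex and the `make_dpo_pair` helper name are assumptions.

```python
import re

# Assumed emoji ranges; the author's exact regex is not given in the card.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF]"
)

def make_dpo_pair(prompt: str, response: str):
    """Build a DPO row: emoji-stripped text as chosen, original as rejected.

    Returns None when the response has no emojis (nothing to contrast).
    """
    if not EMOJI_RE.search(response):
        return None
    chosen = EMOJI_RE.sub("", response).strip()
    return {"prompt": prompt, "chosen": chosen, "rejected": response}

pair = make_dpo_pair("Say hi", "Hi there! \U0001F60A")
# pair["chosen"] is "Hi there!", pair["rejected"] keeps the emoji
```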
|
Then, I applied the Abomination LoRA from Blackroot.
|
Next, I applied the Anjir adapter (the 64-rank version with reduced alpha) to improve formatting while retaining the influence of the previous LoRAs. (This is based on the Anjir feedback, which suggests that Anjir has better formatting than Halu Blackroot.)
|
And then I merged the model with the Anjrit model. (I won't release the Anjrit model, as it struggles with longer contexts; I'm only interested in its no-refusals storytelling abilities. You can find a brief overview of it on my Anjir model page.)
|
|
|
And that's it. Thanks again for all the feedback! |
|
|
|
Notes: |
|
|
|
I'm not responsible for anything. |
|
This is an RP and Storytelling model. |
|
You can leave feedback in the discussions so I can improve my models.
|
Like all of my previous models, it becomes incoherent at higher temperatures, so use around 0.85-1.05. (I've been trying to fix this since Halu Blackroot without much luck, though I think merging the base with Llama 3 Instruct helped a lot.)
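With llama.cpp (covered below), the sampling temperature can be pinned to that recommended range via the `--temp` flag; for example, with the Q4_K_S file from this repo:

```bash
llama-cli --hf-repo Triangle104/Jamet-8B-L3-MK.V-Blackroot-Q4_K_S-GGUF \
  --hf-file jamet-8b-l3-mk.v-blackroot-q4_k_s.gguf \
  --temp 0.95 -p "Once upon a time"
```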
|
|
|
--- |
|
## Use with llama.cpp |
|
Install llama.cpp through brew (works on Mac and Linux) |
|
|
|
```bash
brew install llama.cpp
```
|
Invoke the llama.cpp server or the CLI. |
|
|
|
### CLI: |
|
```bash
llama-cli --hf-repo Triangle104/Jamet-8B-L3-MK.V-Blackroot-Q4_K_S-GGUF --hf-file jamet-8b-l3-mk.v-blackroot-q4_k_s.gguf -p "The meaning to life and the universe is"
```
|
|
|
### Server: |
|
```bash
llama-server --hf-repo Triangle104/Jamet-8B-L3-MK.V-Blackroot-Q4_K_S-GGUF --hf-file jamet-8b-l3-mk.v-blackroot-q4_k_s.gguf -c 2048
```
|
|
|
Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
|
|
|
Step 1: Clone llama.cpp from GitHub. |
|
```bash
git clone https://github.com/ggerganov/llama.cpp
```
|
|
|
Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with any hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
|
```bash
cd llama.cpp && LLAMA_CURL=1 make
```
|
|
|
Step 3: Run inference through the main binary. |
|
```bash
./llama-cli --hf-repo Triangle104/Jamet-8B-L3-MK.V-Blackroot-Q4_K_S-GGUF --hf-file jamet-8b-l3-mk.v-blackroot-q4_k_s.gguf -p "The meaning to life and the universe is"
```
|
or |
|
```bash
./llama-server --hf-repo Triangle104/Jamet-8B-L3-MK.V-Blackroot-Q4_K_S-GGUF --hf-file jamet-8b-l3-mk.v-blackroot-q4_k_s.gguf -c 2048
```
|
|