|
---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---
|
|
|
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6550b16f7490049d6237f200/yCmy0NUWEu8_g3Fe_FklS.jpeg) |
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6550b16f7490049d6237f200/ceugXFkm46gpEKPphl9R_.png) |
|
|
|
## Information |
|
### Description |
|
|
|
My main goal with this one was to merge the smartness of the base Instruct Nemo with the better prose from the different roleplaying fine-tunes. This is version v0.1, still to be tested. Weights shamelessly stolen from @ParasiticRogue (thank you, friend). All credits and thanks go to Intervitens, Mistral AI, NeverSleep, and ShuttleAI for providing the amazing models used in the merge.
|
|
|
### Instruct |
|
|
|
Both Mistral Instruct and ChatML should work. |
|
|
|
```
<s>[INST] {system} [/INST]{assistant}</s>[INST] {user} [/INST]
```
|
|
|
Or... |
|
|
|
```
<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
{assistant}<|im_end|>
```
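
If you drive the model programmatically, here is a minimal sketch that lets the tokenizer's bundled chat template build the Mistral Instruct prompt shown above (the repo ID and the assumption that the merged tokenizer keeps Mistral Nemo's built-in template are mine; for ChatML you would assemble the prompt by hand as above):

```python
# Minimal sketch: build a prompt with the tokenizer's bundled chat template.
# The repo ID below is an assumption; adjust it to wherever you load the model from.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MarinaraSpaghetti/Nemomix-v0.1-12B")

messages = [
    {"role": "user", "content": "Describe the tavern we just walked into."},
]

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # should roughly match the [INST] ... [/INST] format above
```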
|
|
|
### Other Versions |
|
|
|
V1: https://huggingface.co/MarinaraSpaghetti/Nemomix-v1.0-12B |
|
|
|
V2: https://huggingface.co/MarinaraSpaghetti/Nemomix-v2.0-12B |
|
|
|
V3: https://huggingface.co/MarinaraSpaghetti/Nemomix-v3.0-12B |
|
|
|
### Settings |
|
|
|
A lower Temperature of 0.35 is recommended, although I've also had luck with higher Temperatures (1.0-1.2) if you crank up the Min P (0.01-0.1). Run with the base DRY settings of 0.8/1.75/2/0 and you're good to go.
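
For reference, here is a rough sketch of those presets as Hugging Face `GenerationConfig` objects. Min P needs a recent transformers release, and DRY has no transformers equivalent, so the 0.8/1.75/2/0 values only apply to backends (SillyTavern, koboldcpp, text-generation-webui, etc.) that implement it:

```python
# Sketch of the recommended sampler presets as transformers generation configs.
# min_p requires a recent transformers version; the DRY values (0.8/1.75/2/0)
# are backend-specific and are not represented here.
from transformers import GenerationConfig

low_temp = GenerationConfig(
    do_sample=True, temperature=0.35, min_p=0.05, max_new_tokens=256
)
high_temp = GenerationConfig(
    do_sample=True, temperature=1.1, min_p=0.05, max_new_tokens=256
)
# Usage: model.generate(**inputs, generation_config=low_temp)
```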
|
|
|
# Nemomix-v0.1-12B |
|
|
|
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with F:\mergekit\mistralaiMistral-Nemo-Base-2407 as the base.
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* F:\mergekit\intervitens_mini-magnum-12b-v1.1 |
|
* F:\mergekit\mistralaiMistral-Nemo-Instruct-2407 |
|
* F:\mergekit\NeverSleep_Lumimaid-v0.2-12B |
|
* F:\mergekit\shuttleai_shuttle-2.5-mini |
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
models:
  - model: F:\mergekit\shuttleai_shuttle-2.5-mini
    parameters:
      weight: 0.16
      density: 0.42
  - model: F:\mergekit\NeverSleep_Lumimaid-v0.2-12B
    parameters:
      weight: 0.22
      density: 0.54
  - model: F:\mergekit\intervitens_mini-magnum-12b-v1.1
    parameters:
      weight: 0.28
      density: 0.66
  - model: F:\mergekit\mistralaiMistral-Nemo-Instruct-2407
    parameters:
      weight: 0.34
      density: 0.78
merge_method: dare_ties
base_model: F:\mergekit\mistralaiMistral-Nemo-Base-2407
parameters:
  int8_mask: true
dtype: bfloat16
```
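
If you want to reproduce the merge, the same YAML (with the local F:\mergekit paths swapped for your own local copies or the corresponding Hugging Face repo IDs) can be fed to mergekit's CLI; the file and output names below are just examples:

```
mergekit-yaml nemomix-v0.1.yaml ./Nemomix-v0.1-12B --cuda
```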
|
|
|
## Ko-fi |
|
### Enjoying what I do? Consider donating here, thank you! |
|
https://ko-fi.com/spicy_marinara |