Update 2023-12-19

In light of the dataset contamination issues that the community has raised in recent days about some of the merged models, in particular berkeley-nest/Starling-LM-7B-alpha and Q-bert/MetaMath-Cybertron-Starling, we decided to build another merged model without these two models. Additionally, their CC-BY-NC-4.0 license is restrictive, which makes them unsuitable for an open model.

Model Description

This is an experiment to test merging 14 models using DARE TIES 🦙

The result is a base model that performs quite well but requires some further instruction fine-tuning.
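
For readers unfamiliar with the method, here is a minimal pure-Python sketch of the idea behind DARE TIES, operating on flat lists of numbers rather than real model tensors. Each fine-tuned model's difference from the base is randomly sparsified and rescaled (DARE), multiplied by its weight, and then only the contributions that agree with the elected sign for each parameter are added back to the base (TIES-style sign election). The function names, the toy inputs, and the absence of any weight normalization are assumptions made for illustration; this is not the implementation that produced this model.

```python
import random


def dare_sparsify(delta, density, rng):
    """Keep each entry with probability `density`; rescale survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def elect_sign(weighted_deltas):
    """Per parameter, pick the sign of the summed weighted deltas."""
    n = len(weighted_deltas[0])
    totals = [sum(wd[i] for wd in weighted_deltas) for i in range(n)]
    return [1.0 if t >= 0 else -1.0 for t in totals]


def dare_ties_merge(base, finetuned, weights, densities, seed=0):
    """Toy DARE TIES merge of several fine-tuned parameter lists into `base`."""
    rng = random.Random(seed)
    # Task vectors: what each fine-tuned model changed relative to the base.
    deltas = [[f - b for f, b in zip(model, base)] for model in finetuned]
    # DARE step (random drop and rescale), followed by the per-model weight.
    weighted = [
        [w * d for d in dare_sparsify(delta, density, rng)]
        for delta, w, density in zip(deltas, weights, densities)
    ]
    signs = elect_sign(weighted)
    merged = []
    for i, b in enumerate(base):
        # Add only the contributions whose sign matches the elected sign.
        agreeing = sum(wd[i] for wd in weighted if wd[i] * signs[i] > 0)
        merged.append(b + agreeing)
    return merged


# Toy usage: two "models" with two parameters each, nothing dropped (density 1.0).
toy = dare_ties_merge(
    base=[0.0, 0.0],
    finetuned=[[1.0, -1.0], [1.0, 2.0]],
    weights=[0.5, 0.5],
    densities=[1.0, 1.0],
)
```

In the toy call, the second parameter has conflicting updates (-1.0 and 2.0); the positive sign wins the election, so the negative contribution is discarded instead of partially cancelling the positive one. With a density below 1.0, the rescaling by 1/density keeps the expected size of each task vector unchanged even though most entries are dropped.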

The 14 models are as follows:

  1. mistralai/Mistral-7B-Instruct-v0.2
  2. ehartford/dolphin-2.2.1-mistral-7b
  3. SciPhi/SciPhi-Mistral-7B-32k
  4. ehartford/samantha-1.2-mistral-7b
  5. Arc53/docsgpt-7b-mistral
  6. berkeley-nest/Starling-LM-7B-alpha
  7. Q-bert/MetaMath-Cybertron-Starling
  8. Open-Orca/Mistral-7B-OpenOrca
  9. v1olet/v1olet_marcoroni-go-bruins-merge-7B
  10. beowolx/MistralHermes-CodePro-7B-v1
  11. TIGER-Lab/MAmmoTH-7B-Mistral
  12. teknium/OpenHermes-2.5-Mistral-7B
  13. Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
  14. mlabonne/NeuralHermes-2.5-Mistral-7B

The YAML config file for this model is as follows:

models:
  - model: mistralai/Mistral-7B-v0.1
    # no parameters necessary for base model
  - model: ehartford/dolphin-2.2.1-mistral-7b
    parameters:
      weight: 0.08
      density: 0.4
  - model: SciPhi/SciPhi-Mistral-7B-32k
    parameters:
      weight: 0.08
      density: 0.4
  - model: ehartford/samantha-1.2-mistral-7b
    parameters:
      weight: 0.08
      density: 0.4
  - model: Arc53/docsgpt-7b-mistral
    parameters:
      weight: 0.08
      density: 0.4
  - model: berkeley-nest/Starling-LM-7B-alpha
    parameters:
      weight: 0.08
      density: 0.4
  - model: Q-bert/MetaMath-Cybertron-Starling
    parameters:
      weight: 0.08
      density: 0.4
  - model: Open-Orca/Mistral-7B-OpenOrca
    parameters:
      weight: 0.08
      density: 0.4
  - model: v1olet/v1olet_marcoroni-go-bruins-merge-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: beowolx/MistralHermes-CodePro-7B-v1
    parameters:
      weight: 0.08
      density: 0.4
  - model: TIGER-Lab/MAmmoTH-7B-Mistral
    parameters:
      weight: 0.08
      density: 0.4
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
    parameters:
      weight: 0.08
      density: 0.4
  - model: mlabonne/NeuralHermes-2.5-Mistral-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: mistralai/Mistral-7B-Instruct-v0.2
    parameters:
      weight: 0.08
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
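
Because every fine-tuned model in the config above uses the same weight, and only mistralai/Mistral-7B-Instruct-v0.2 uses a different density, the file can also be generated rather than written by hand. The following is a small sketch using only the Python standard library; the helper name `build_config` and its arguments are hypothetical and not part of any merge tool.

```python
BASE = "mistralai/Mistral-7B-v0.1"

# Fine-tuned models in the same order as the config above.
FINETUNED = [
    "ehartford/dolphin-2.2.1-mistral-7b",
    "SciPhi/SciPhi-Mistral-7B-32k",
    "ehartford/samantha-1.2-mistral-7b",
    "Arc53/docsgpt-7b-mistral",
    "berkeley-nest/Starling-LM-7B-alpha",
    "Q-bert/MetaMath-Cybertron-Starling",
    "Open-Orca/Mistral-7B-OpenOrca",
    "v1olet/v1olet_marcoroni-go-bruins-merge-7B",
    "beowolx/MistralHermes-CodePro-7B-v1",
    "TIGER-Lab/MAmmoTH-7B-Mistral",
    "teknium/OpenHermes-2.5-Mistral-7B",
    "Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp",
    "mlabonne/NeuralHermes-2.5-Mistral-7B",
    "mistralai/Mistral-7B-Instruct-v0.2",
]


def build_config(models, base=BASE, weight=0.08, default_density=0.4,
                 density_overrides=None):
    """Return the merge config as YAML text (hypothetical helper)."""
    density_overrides = density_overrides or {}
    lines = [
        "models:",
        f"  - model: {base}",
        "    # no parameters necessary for base model",
    ]
    for name in models:
        density = density_overrides.get(name, default_density)
        lines += [
            f"  - model: {name}",
            "    parameters:",
            f"      weight: {weight}",
            f"      density: {density}",
        ]
    lines += [
        "merge_method: dare_ties",
        f"base_model: {base}",
        "parameters:",
        "  int8_mask: true",
        "dtype: bfloat16",
    ]
    return "\n".join(lines) + "\n"


config_text = build_config(
    FINETUNED,
    density_overrides={"mistralai/Mistral-7B-Instruct-v0.2": 0.5},
)
```

Writing `config_text` to a file reproduces the config shown above. The format appears to be the one used by mergekit; assuming so, the file could then be passed to its `mergekit-yaml` command together with an output directory.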