# Mistral-Merge-7B-slerp
## Model Description
`Mistral-Merge-7B-slerp` is a merged model that uses spherical linear interpolation (SLERP) to blend layers from two transformer-based models. The merge aims to combine the robust linguistic capabilities of `OpenPipe/mistral-ft-optimized-1218` with the nuanced instruction-following behavior of `mlabonne/NeuralHermes-2.5-Mistral-7B`.
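For reference, the SLERP operation can be sketched in a few lines of PyTorch. The snippet below is purely illustrative: the function name `slerp` and the flatten-and-interpolate treatment of each tensor are assumptions of this sketch, not the exact routine used to produce the merge.

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors, treated as flat vectors.

    t=0 returns v0, t=1 returns v1; intermediate values follow the great-circle arc
    between the two (normalized) directions. Illustrative sketch only.
    """
    v0_flat, v1_flat = v0.flatten(), v1.flatten()
    v0_unit = v0_flat / (v0_flat.norm() + eps)
    v1_unit = v1_flat / (v1_flat.norm() + eps)
    cos_omega = torch.clamp(torch.dot(v0_unit, v1_unit), -1.0, 1.0)
    omega = torch.acos(cos_omega)
    # Nearly colinear tensors: fall back to plain linear interpolation.
    if omega < eps:
        return ((1 - t) * v0_flat + t * v1_flat).reshape(v0.shape)
    sin_omega = torch.sin(omega)
    out = (torch.sin((1 - t) * omega) / sin_omega) * v0_flat \
        + (torch.sin(t * omega) / sin_omega) * v1_flat
    return out.reshape(v0.shape)
```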
## Configuration
The merge applies the SLERP method across the full layer range of the two source models. Below is the YAML configuration used for merging:
```yaml
slices:
  - sources:
      - model: OpenPipe/mistral-ft-optimized-1218
        layer_range: [0, 32]
      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: OpenPipe/mistral-ft-optimized-1218
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
This configuration interpolates the self-attention and MLP (multi-layer perceptron) sub-blocks with a gradient of interpolation weights across the layer stack, while all remaining parameters are blended with a constant factor of 0.5, so that features from both models are integrated throughout the network.
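To illustrate how such a gradient of weights behaves, the sketch below expands a short anchor list into one interpolation factor per layer via piecewise-linear interpolation. This is an assumption-laden illustration: the helper `expand_t_schedule` is hypothetical, and the actual merge tooling may expand the schedule differently.

```python
import numpy as np

def expand_t_schedule(anchors, num_layers):
    """Expand a short anchor list (e.g. [0, 0.5, 0.3, 0.7, 1]) into one
    interpolation factor per layer using piecewise-linear interpolation.
    Hypothetical helper for illustration only."""
    anchor_pos = np.linspace(0.0, 1.0, num=len(anchors))  # positions of the anchors
    layer_pos = np.linspace(0.0, 1.0, num=num_layers)     # normalized layer positions
    return np.interp(layer_pos, anchor_pos, anchors)

self_attn_t = expand_t_schedule([0, 0.5, 0.3, 0.7, 1], num_layers=32)
mlp_t = expand_t_schedule([1, 0.5, 0.7, 0.3, 0], num_layers=32)
print(np.round(self_attn_t[:4], 3))  # early layers: factors near 0 for self-attention
print(np.round(mlp_t[:4], 3))        # early layers: factors near 1 for the MLP blocks
```

Because the two anchor lists mirror each other, the resulting schedules are complementary: layers where the self-attention tensors stay close to one source model have MLP tensors that lean toward the other, and vice versa.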