aranea-ancilla-116b-v1.0

aka MiquMaid-v1-70B + interleaved WinterGoddess-1.4x-70B-L2

A mergekit frankenmerge based on NeverSleep/MiquMaid-v1-70B with interleaved layers of Sao10K/WinterGoddess-1.4x-70B-L2.
This was the top-performing model from a series of merge experiments aimed at producing a highly coherent creative-writing model.
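For illustration, the sketch below shows the general shape of a mergekit passthrough configuration for this kind of interleave. The exact slice list for this model has not been published; the layer ranges here are an assumption derived from the stride/width/offset figures discussed under the benchmarking observations below, though they do happen to total the 136 layers listed on this card.

```yaml
# Illustrative passthrough frankenmerge config (NOT the published recipe).
# Slices alternate between the base model and the interleaved model;
# layer ranges are assumptions consistent with the figures in this card.
slices:
  - sources:
      - model: NeverSleep/MiquMaid-v1-70B
        layer_range: [0, 20]
  - sources:
      - model: Sao10K/WinterGoddess-1.4x-70B-L2
        layer_range: [14, 30]   # offset first slice; see observations below
  - sources:
      - model: NeverSleep/MiquMaid-v1-70B
        layer_range: [20, 40]
  - sources:
      - model: Sao10K/WinterGoddess-1.4x-70B-L2
        layer_range: [30, 50]
  - sources:
      - model: NeverSleep/MiquMaid-v1-70B
        layer_range: [40, 60]
  - sources:
      - model: Sao10K/WinterGoddess-1.4x-70B-L2
        layer_range: [50, 70]
  - sources:
      - model: NeverSleep/MiquMaid-v1-70B
        layer_range: [60, 80]
merge_method: passthrough
dtype: float16
```

A configuration like this is run with mergekit's CLI, e.g. `mergekit-yaml config.yml ./output-model`.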

Tests consisted of a series of private benchmarks and manual comparisons. A number of different base models, interleave models, and layer offsets were compared.

  • Usable context ~32768
  • Recommended context ~16384

Non-frankenstein miqu-1 finetunes generally outperform their frankenstein counterparts at very long contexts due to coherency loss.
As a rough guideline, consider swapping to either NeverSleep/MiquMaid-v1-70B or 152334H/miqu-1-70b-sf beyond 16k context.

Layers: 136

License

No license. The component models are based on Mistral AI's Miqu-1 llama2 finetune, which was released without a license.

Interesting observations from benchmarking

  • A 10-layer interleave stride with a 20-layer interleave width consistently outperformed alternative combinations.
  • Offsetting the interleaved model's first set of layers generally improved coherency: a [14-30] first slice reliably beat the [10-30] mergekit slice configuration across various combinations of models (see the fragment below).
  • Quality of the resulting merges can vary wildly. Whilst a merge of two strong models tends to produce a strong frankenstein model, this rule does not always hold.
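To make the offset observation concrete: the two configurations compared above differ only in the first interleaved slice of the `slices:` list. A schematic fragment of that entry, with the surrounding base-model slices omitted, might look like this:

```yaml
# [10-30] first interleaved slice: weaker coherency in testing
  - sources:
      - model: Sao10K/WinterGoddess-1.4x-70B-L2
        layer_range: [10, 30]

# [14-30] offset first interleaved slice: reliably more coherent
  - sources:
      - model: Sao10K/WinterGoddess-1.4x-70B-L2
        layer_range: [14, 30]
```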

Quantizations

Exllamav2 quants will be available when bandwidth permits.
