---
base_model:
- sometimesanotion/Lamarck-14B-v0.7
- sometimesanotion/Qwenvergence-14B-v12-Prose-DS
- jpacifico/Chocolatine-2-14B-Instruct-v2.0.3
- suayptalha/Lamarckvergence-14B
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
language:
- en
---
# EXPERIMENTAL: So what's this new arcee_fusion merge method, and what can we do with it?

This model aims to find out, as a multi-stage merge where 3 of the 4 steps are fusions:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/665fef5a4794222f6a2fe605/5J98k-IqLMb_obGPaP1CM.png)

* A fusion of [Lamarck-14B-v0.7](http://huggingface.co/sometimesanotion/Lamarck-14B-v0.7) with @suayptalha's [Lamarckvergence-14B](http://huggingface.co/suayptalha/Lamarckvergence-14B), itself a SLERP merge of Lamarck-14B-v0.7 and [Qwenvergence-14B-v12-Prose-DS](http://huggingface.co/sometimesanotion/Qwenvergence-14B-v12-Prose-DS).
* A SLERP of Lamarck-14B-v0.7-Fusionvergence with Qwenvergence-14B-v12-Prose-DS, the latter emphasized in later layers.
* A fusion of @jpacifico's [Chocolatine-2-14B-Instruct-v2.0.3](http://huggingface.co/jpacifico/Chocolatine-2-14B-Instruct-v2.0.3), itself a finetune of a merge of Lamarck-14B-v0.7, Arcee's [Virtuoso-Small-v2](https://huggingface.co/arcee-ai/Virtuoso-Small-v2), and Qwenvergence-14B-v12-Prose-DS, fusion-merged with - you guessed it - Qwenvergence-14B-v12-Prose-DS.
* A fusion of the previous two.

I've seen strong prose from this model, which is natural given its re-emphasis of Qwenvergence-14B-v12-Prose-DS. A full evaluation will be queued shortly.

This merge strategy is much simpler than a mainline Lamarck release, but that simplicity is necessary to see how multiple fusion merges behave. Where it fits into efforts towards a Lamarck v0.8 depends greatly on evaluation and feedback.

### Configuration

The following YAML configuration was used to produce this model:

```yaml
name: Lamarck-14B-v0.7-Fusionvergence
merge_method: arcee_fusion
base_model: sometimesanotion/Lamarck-14B-v0.7
tokenizer_source: base
parameters:
  int8_mask: true
  normalize: true
  rescale: false
dtype: bfloat16
out_dtype: bfloat16
models:
  - model: suayptalha/Lamarckvergence-14B
---
name: Slerp-Lamarckvevergence
base_model: sometimesanotion/Lamarck-14B-v0.7-Fusion-Lamarckvergence
merge_method: slerp
tokenizer_source: base
dtype: float32
out_dtype: bfloat16
parameters:
  t:
    - filter: self_attn
      value: [ 0.00, 0.50, 0.30, 0.70, 1.00 ]
    - filter: mlp
      value: [ 1.00, 0.50, 0.70, 0.30, 0.00 ]
    - value: [ 0.00, 0.00, 0.00, 0.00, 0.04, 0.08, 0.12, 0.16, 0.24, 0.32, 0.40, 0.48, 0.56, 0.64, 0.72, 0.72, 0.72, 0.72, 0.72, 0.72, 0.72, 0.72, 0.64, 0.56, 0.48 ]
slices:
  - sources:
      - model: sometimesanotion/Lamarck-14B-v0.7-Fusion-Lamarckvergence
        layer_range: [ 0, 48 ]
      - model: sometimesanotion/Qwenvergence-14B-v12-Prose-DS
        layer_range: [ 0, 48 ]
---
name: Chocolatine-Fusion-Qwenvergence
merge_method: arcee_fusion
base_model: jpacifico/Chocolatine-2-14B-Instruct-v2.0.3
tokenizer_source: base
parameters:
  int8_mask: true
  normalize: true
  rescale: false
dtype: bfloat16
out_dtype: bfloat16
models:
  - model: sometimesanotion/Qwenvergence-14B-v12-Prose-DS
---
name: Lamarck-14B-v0.7-Fusion
merge_method: arcee_fusion
base_model: sometimesanotion/Slerp-Lamarckvevergence
tokenizer_source: base
parameters:
  int8_mask: true
  normalize: true
  rescale: false
dtype: bfloat16
out_dtype: bfloat16
models:
  - model: sometimesanotion/Chocolatine-Fusion-Qwenvergence
```
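
### Running the stages

A note on reproduction: the configuration above is a multi-document YAML, one document per merge stage, with each stage's output feeding a later one. The sketch below is a minimal, hedged example of driving those stages from mergekit's Python API (`MergeConfiguration` and `run_merge`). The config filename, output directories, and the sequential stage loop are assumptions for illustration rather than the exact workflow used here, and the arcee_fusion method requires a recent mergekit release; for a fully local run you would also repoint the intermediate `sometimesanotion/...` references at your own stage outputs.

```python
# Minimal sketch, not the exact workflow used for this model: run the
# multi-document mergekit config one stage at a time via the Python API.
# Assumptions: the YAML above is saved as "lamarck-fusion-stages.yaml"
# (hypothetical filename), and each stage's model references resolve,
# either as existing Hugging Face repos or after editing them to point
# at ./merges/<stage-name>.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_PATH = "lamarck-fusion-stages.yaml"

with open(CONFIG_PATH, "r", encoding="utf-8") as fp:
    stages = list(yaml.safe_load_all(fp))  # one document per '---' section

for stage in stages:
    # 'name' labels the stage; drop it before validating the merge config itself.
    stage_name = stage.pop("name")
    merge_config = MergeConfiguration.model_validate(stage)
    run_merge(
        merge_config,
        f"./merges/{stage_name}",  # output directory for this stage
        options=MergeOptions(
            cuda=torch.cuda.is_available(),
            copy_tokenizer=True,
            lazy_unpickle=False,
            low_cpu_memory=False,
        ),
    )
```

If your installed mergekit provides the `mergekit-multi` entry point for named, multi-document configs like this one, that CLI is the simpler route; otherwise each document can be split into its own file and run individually with `mergekit-yaml`.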