grimjim/franken-kunoichi-IDUS-11B
Text Generation
Models that didn't quite work out, but may still be of interest.
Note: Proof by counterexample that fine-tuning is mandatory for coherent behavior after a drastic frankenmerge.
Note: Unimpressive result from a Model Stock merge, probably because too few models were used.
Note: Failed attempt to extend context length to 32k tokens.
Note: Experiment with negative merge weighting. Basis of the proposal for Orthogonal Vector Adaptation (OVA).