metadata
license: other
license_name: yi-license
license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
tags:
- merge
Kyllene 34B v1.1
Model Details
- The result of a new merge method provided by the MergeMonster tool, using an extended RPG preset.
- Models used for the merge: jondurbin/bagel-dpo-34b-v0.2, NousResearch/Nous-Capybara-34B, NousResearch_Nous-Hermes-2-Yi-34B, SUSTech/SUS-Chat-34B
- The method aims to maximize the probability of certain phrases and minimize the probability of others.
- The RPG preset was extended with examples of the typical, nonsensical output of most models, like 'unbreakable bond', 'send shivers down her spine' etc.
- The resulting model has approximately 34 billion parameters.
- See mergekit-config.yml for details on the merge method used and RPG presets.
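The idea of maximizing the probability of some phrases while minimizing that of others can be illustrated with a toy score. This is only a sketch of the general principle, not MergeMonster's actual code: the function names, the candidate merges and the per-token probabilities below are all hypothetical and made up for illustration.

```python
import math

def phrase_logprob(token_probs):
    """Log-probability of a phrase: sum of the logs of its per-token probabilities."""
    return sum(math.log(p) for p in token_probs)

def merge_score(wanted, unwanted):
    """Toy objective (hypothetical): higher is better.

    Rewards probability mass on wanted phrases and penalises
    probability mass on unwanted phrases.
    """
    gain = sum(phrase_logprob(p) for p in wanted)
    penalty = sum(phrase_logprob(p) for p in unwanted)
    return gain - penalty

# Made-up per-token probabilities that two candidate merges assign
# to one wanted phrase and one unwanted phrase (two tokens each).
candidates = {
    "candidate_a": {"wanted": [[0.5, 0.5]], "unwanted": [[0.5, 0.5]]},
    "candidate_b": {"wanted": [[0.8, 0.5]], "unwanted": [[0.1, 0.5]]},
}

scores = {
    name: merge_score(c["wanted"], c["unwanted"])
    for name, c in candidates.items()
}
best = max(scores, key=scores.get)
print(scores, best)
```

In this toy setup the candidate that makes the unwanted phrase less likely gets the higher score, which is the behaviour the extended RPG preset is meant to encourage.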
Warning: This model can produce NSFW content!
Results
- produces SFW and NSFW content without issues, switches context seamlessly.
- 200K context length
- good at following instructions
- different from TeeZee/Kyllene-57B-v1.0, but also surprisingly entertaining (more tests are needed)
Side notes
- The MergeMonster method works, but the project would benefit greatly from some more love from developers.
- In its current state, MergeMonster consumes insane amounts of RAM (256GB+) or VRAM and takes a really long time to process model data; this merge took 24 hours on 1xADA6000.
- MergeMonster is not a silver bullet; other experiments have shown that it can also produce incredibly stupid models.
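For a rough sense of scale, here is a back-of-the-envelope estimate of the weight storage alone. It assumes 16-bit weights and that all four source models are held in memory at once; both are assumptions for illustration, not documented MergeMonster behaviour, and the constant and function names are hypothetical.

```python
PARAMS_PER_MODEL = 34e9   # approximate parameter count, from Model Details
BYTES_PER_PARAM = 2       # assumption: 16-bit (fp16/bf16) weights
NUM_SOURCE_MODELS = 4     # the four models listed for this merge

def weights_gb(num_models):
    """Memory for raw weights only, in GB (10**9 bytes)."""
    return num_models * PARAMS_PER_MODEL * BYTES_PER_PARAM / 1e9

per_model = weights_gb(1)                   # one 34B model
all_sources = weights_gb(NUM_SOURCE_MODELS)  # all source models together
print(f"one model: {per_model:.0f} GB, all sources: {all_sources:.0f} GB")
```

Under these assumptions a single model needs about 68 GB and the four sources together about 272 GB, which is in the same range as the 256GB+ figure above.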
All comments are greatly appreciated. Download, test, and if you appreciate my work, consider buying me my fuel: