Qwen 2 model CAT-exl2?
I really like this model; it's reliable. I've heard Qwen might be a strong challenger, so would you consider making a similar Qwen 2 72B CAT-exl2 model?
Can you share the merging code, i.e. how you created this merged version? I'm interested in exploring this risky zone with the latest models. If you can, please share it here or at my email: [email protected]?
No need for email; the code is right there in the repo, just look at mergekit_config.yml. I think the merge command was something like "mergekit-yaml config.yaml merge --copy-tokenizer --allow-crimes --out-shard-size 1B --lazy-unpickle".
The only special thing about this one is that cat has a custom tokenizer; I think I just deleted it and put the Llama 3 tokenizer in its place, and the merge went OK.
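For anyone who wants to try reproducing this with Qwen, a mergekit YAML config for this kind of merge generally looks something like the sketch below. The model names, merge method, and weights here are illustrative placeholders, not the actual contents of mergekit_config.yml; check the file in the repo for the real recipe:

```yaml
# Hypothetical mergekit config sketch; the real one is mergekit_config.yml in the repo.
models:
  - model: Qwen/Qwen2-72B-Instruct   # placeholder first model
    parameters:
      weight: 0.5                    # illustrative blend weight
  - model: some-org/cat-finetune     # hypothetical second model
    parameters:
      weight: 0.5
merge_method: linear                 # assumed method; the repo config may differ
dtype: bfloat16
```

You would then run it with the mergekit-yaml command quoted above, pointing the first argument at this config file.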
Sorry, I didn't see your previous comment. I'm moving my workflow from a system with NVIDIA cards to a Mac, so I don't know whether the same kind of work is possible there. I'll try to make a Llama 3.1 version and maybe a Qwen 2 one, but I can't promise anything.