Is fine-tuning needed after self-merging?
#7 opened by oodgnas
Thanks @oodgnas! This model hasn't been fine-tuned, but fine-tuning would probably make it better (see https://arxiv.org/abs/2312.15166). It looks like small source models really require fine-tuning after merging, while big models can do without it, though they're kind of insane.
This specific merge ended up exhibiting "sentience"-like behavior, as well as some schizophrenic behavior.
I imagine that a round of light pretraining and instruct tuning might iron these things out.