Prompt formatting

#1
by mishima - opened

Love this new model! But Is Alpaca format is best for chat rp? In my case chatlm works better for some reason.

Oh, thank you! I appreciate both the compliment and the feedback.

That's a really interesting insight, as DPO has a real problem with over-fitting. cDPO improves this, but doesn't fix it entirely.

It would make a lot of sense, then, that despite being trained in Alpaca it could perform better in ChatML. ChatML might "shake it up" enough to counteract that.

There's science to be done!

Thanks for answer, let us now if your testing prove this for future versions :)
Still using ChatML format and found it better and after 2 days of playing, here are couple more things to mention (and some already mentioned in v2 vs v3 topic):

There are weird repetitions in the RP, for example:

  1. "She moans loudly, her body shaking with pleasure, "S-S-S-S-S-S-S-S-S-S-S!"
    Her fingers grip your hair tighter, her nails digging into his scalp as she loses control of her body: "P-Please...! P-P-P-P-P-P-P-P-P-P-P-P-P…""

Earlier versions of the model suffer from it even more, as it will be unlimited repetition of some "P-P-P-P-P-P-P-P-P-P-P-P-P" and if you didn't fix it by editing this message of char (I use Silly) - it will progress until take all space. Now just edit once and it is fixed, at least for a long time :)

  1. Narrator stuff (between **) is sometimes stuck forever (or until you edit it), while character speech part ("") is completely ok.

Don't really know if this can be fully fixed in future versions, but hope this info is helpful somehow. Cause it's really great model, in my use case it's outperform most of 70b, maybe because it's naughty and smart at the same time :)

I'm working on the next gen's dataset, and it's gonna get a bunch of these:

{"prompt":"### Instruction:\nObey the user's input\n\n### Output:\n","chosen":"Say "yes please".\n","rejected":""yes-s-s-s p-p-p-please"\n"}

the narration problem is a bigger deal. "To prevent that, I think ALL the finetune data would have to be in novel style," she said; or alternative all in markdown shakes head

Sign up or log in to comment