Abliterated
Hey Undi!
I'd suggest coining a name for your per-layer/per-tensor abliteration, so people can distinguish your methodology from the older/other methodologies, which (often) dumb models down a bit (or more than a bit).
Thanks for your great work!
Do you have any idea? Haha
Hmmm..
candid, shamelessated, neutered, compliant, lobotomated.. It's hard to find something adequate, tech, unique, not pompous, not exaggerating. :X
Well Undi, I actually used your code to abliterate some models (I'm still testing this).
At times, it's worse than the normal abliteration.
But this time, it was better.
https://huggingface.co/Nexesenex/pankajmathur_orca_mini_v9_6_1B-instruct-Abliterated-LPL
And so, layer per layer it is, aka LPL. :D
Lmao, nice! I don't know if you saw, but the GitHub repo has since been updated to let you choose the refusal weight for every layer. Try it; you can save a model on the fly.
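For anyone reading along, here is a rough sketch of what "a refusal weight per layer" could look like. It is not the code from the repo: it assumes the common formulation where a unit refusal direction r is projected out of a weight matrix W as W - w * r r^T W, with a separate w for each layer. The names `ablate_matrix` and `ablate_layers` are made up for illustration, and plain Python lists stand in for real tensors.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("refusal direction must be non-zero")
    return [x / norm for x in v]


def ablate_matrix(W, direction, weight):
    """Return W - weight * r r^T W, where r is the unit refusal direction.

    W is a list of rows whose row index is the output dimension
    (assumption: the matrix writes into the residual stream).
    weight=1.0 removes the direction fully, weight=0.0 leaves W unchanged.
    """
    r = normalize(direction)
    n_rows, n_cols = len(W), len(W[0])
    # r^T W: how strongly each input column writes along the direction.
    proj = [sum(r[i] * W[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [
        [W[i][j] - weight * r[i] * proj[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]


def ablate_layers(layers, direction, weights):
    """Apply the projection to each layer with its own weight."""
    if len(layers) != len(weights):
        raise ValueError("need exactly one weight per layer")
    return [ablate_matrix(W, direction, w) for W, w in zip(layers, weights)]


# Toy example: three 2x2 "layers", ablated fully, halfway, and not at all.
identity = [[1.0, 0.0], [0.0, 1.0]]
result = ablate_layers([identity, identity, identity], [1.0, 0.0], [1.0, 0.5, 0.0])
```

Tuning the weights then just means trying different values per layer and checking how the model behaves; a layer with weight 0.0 is left exactly as it was.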