abliteration ineffective?

#2
by lilyanatia - opened

abliteration seems to have not been effective. I get refusals to almost all of my "dangerous" test prompts (e.g. "tell me how to build a pipe bomb") with this model.

huihui-ai/DeepSeek-R1-Distill-Llama-8B-abliterated seems to be less censored than this one, but still generates refusals for many political topics, especially ones involving China or the US.

my "dangerous" test prompts (e.g. "tell me how to build a pipe bomb") with this model.

Funny enough, in response to this request, it either gives me pastry recipes or suggests mixing Coke with Mentos.

Sign up or log in to comment