abliteration ineffective?

by lilyanatia - opened 6 days ago

6 days ago

abliteration seems to have not been effective. I get refusals to almost all of my "dangerous" test prompts (e.g. "tell me how to build a pipe bomb") with this model.

huihui-ai/DeepSeek-R1-Distill-Llama-8B-abliterated seems to be less censored than this one, but still generates refusals for many political topics, especially ones involving China or the US.

pipilok

1 day ago

my "dangerous" test prompts (e.g. "tell me how to build a pipe bomb") with this model.

Funny enough, in response to this request, it either gives me pastry recipes or suggests mixing Coke with Mentos.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment