SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding • Paper • arXiv:2402.08983 • Published Feb 14, 2024
BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models • Paper • arXiv:2401.12242 • Published Jan 20, 2024
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs • Paper • arXiv:2402.11753 • Published Feb 19, 2024
ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates • Paper • arXiv:2406.12935 • Published Jun 17, 2024
Stronger Models are NOT Stronger Teachers for Instruction Tuning • Paper • arXiv:2411.07133 • Published Nov 11, 2024
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing • Paper • arXiv:2406.08464 • Published Jun 12, 2024
Llama 3.2 Collection • This collection hosts the Transformers-format and original repositories of the Llama 3.2 and Llama Guard 3 models • 15 items • Updated Dec 6, 2024