How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries Paper • 2402.15302 • Published Feb 23, 2024
SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models Paper • 2406.12274 • Published Jun 18, 2024
Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions Paper • 2501.01872 • Published Jan 3, 2025
Navigating the Cultural Kaleidoscope: A Hitchhiker's Guide to Sensitivity in Large Language Models Paper • 2410.12880 • Published Oct 15, 2024
M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework Paper • 2411.06176 • Published Nov 9, 2024
AI and Safety Collection Papers we have published at several top NLP/AI conferences, such as ACL, EMNLP, AAAI, and ICWSM • 8 items