Monet: Mixture of Monosemantic Experts for Transformers • Paper • 2412.04139 • Published Dec 5, 2024
Inference-Time Intervention (ITI) Models • Collection • A collection of Llama models with Inference-Time Intervention (Li et al.) applied to them. Codebase: https://github.com/likenneth/honest_llama • 6 items • Updated Aug 24, 2024