On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes Paper • 2306.13649 • Published Jun 23, 2023 • 18
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models Paper • 1909.11299 • Published Sep 25, 2019 • 2
SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation Paper • 2410.14745 • Published Oct 17, 2024 • 47
Aligning Large Language Models via Self-Steering Optimization Paper • 2410.17131 • Published Oct 22, 2024 • 22
Modulated Intervention Preference Optimization (MIPO): Keep the Easy, Refine the Difficult Paper • 2409.17545 • Published Sep 26, 2024 • 20