Collections including paper arxiv:1909.08593

- Fine-Tuning Language Models from Human Preferences
  Paper • 1909.08593 • Published • 3
- Transforming and Combining Rewards for Aligning Large Language Models
  Paper • 2402.00742 • Published • 11
- Leverage the Average: an Analysis of KL Regularization in RL
  Paper • 2003.14089 • Published • 2
- Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward
  Paper • 2404.01258 • Published • 10

- The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
  Paper • 1801.03924 • Published • 2
- Fine-Tuning Language Models from Human Preferences
  Paper • 1909.08593 • Published • 3
- Training Verifiers to Solve Math Word Problems
  Paper • 2110.14168 • Published • 4
- Learning Transferable Visual Models From Natural Language Supervision
  Paper • 2103.00020 • Published • 11

- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 48
- ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
  Paper • 2402.09320 • Published • 6
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 39
- Dueling RL: Reinforcement Learning with Trajectory Preferences
  Paper • 2111.04850 • Published • 2

- PERL: Parameter Efficient Reinforcement Learning from Human Feedback
  Paper • 2403.10704 • Published • 57
- WARM: On the Benefits of Weight Averaged Reward Models
  Paper • 2401.12187 • Published • 17
- RewardBench: Evaluating Reward Models for Language Modeling
  Paper • 2403.13787 • Published • 21
- DreamReward: Text-to-3D Generation with Human Preference
  Paper • 2403.14613 • Published • 35