Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • 2 days ago • 8
Diffusion Model Alignment Using Direct Preference Optimization Paper • 2311.12908 • Published Nov 21, 2023 • 48
ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs Paper • 2311.13600 • Published Nov 22, 2023 • 43
T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models Paper • 2302.08453 • Published Feb 16, 2023 • 9