argilla/ultrafeedback-binarized-preferences-cleaned Viewer • Updated Dec 11, 2023 • 60.9k • 5.03k • 130
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement Paper • 2403.15042 • Published Mar 22, 2024 • 26
GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices Paper • 2406.08451 • Published Jun 12, 2024 • 24
instruction-pretrain/ft-instruction-synthesizer-collection Viewer • Updated Dec 2, 2024 • 249k • 1.15k • 60
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published Jun 28, 2024 • 98
Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets Paper • 2405.18952 • Published May 29, 2024 • 10
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 136
Enhancing Human-Like Responses in Large Language Models Paper • 2501.05032 • Published 17 days ago • 49
Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B Viewer • Updated 1 day ago • 250k • 31 • 2