Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models Paper • 2412.02980 • Published 21 days ago • 12
PERSONA: A Reproducible Testbed for Pluralistic Alignment Paper • 2407.17387 • Published Jul 24 • 18
DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published Jun 17 • 50