InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning Paper • 2502.11573 • Published 9 days ago • 8
ScaleQuest Collection We introduce ScaleQuest, a scalable and novel data synthesis method. Project Page: https://scalequest.github.io/ • 9 items • Updated Jan 7 • 6
Running 529 Open Source AI Year In Review 2024 😻 What happened in open-source AI this year, and what’s next?
Running 102 TxT360: Trillion Extracted Text 📖 Create a large, deduplicated dataset for LLM pre-training