SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 8 items • Updated 5 days ago • 160
CLEAR: Character Unlearning in Textual and Visual Modalities Paper • 2410.18057 • Published 17 days ago • 197
Mapping the Media Landscape: Predicting Factual Reporting and Political Bias Through Web Interactions Paper • 2410.17655 • Published 18 days ago • 5
How Many Van Goghs Does It Take to Van Gogh? Finding the Imitation Threshold Paper • 2410.15002 • Published 22 days ago • 6
C4AI Aya Expanse Collection Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 3 items • Updated 16 days ago • 25
SmolLM Collection A series of smol LLMs: 135M, 360M, and 1.7B. We release base and Instruct models, as well as the training corpus and some WebGPU demos. • 12 items • Updated Aug 18 • 192
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines Paper • 2410.12705 • Published 24 days ago • 29
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation Paper • 2410.13848 • Published 23 days ago • 27
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 25 days ago • 129
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Sep 18 • 305
CursorCore: Assist Programming through Aligning Anything Paper • 2410.07002 • Published Oct 9 • 13
Aria: An Open Multimodal Native Mixture-of-Experts Model Paper • 2410.05993 • Published Oct 8 • 107
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1 • 143