SurveyX: Academic Survey Automation via Large Language Models Paper • 2502.14776 • Published 5 days ago • 80
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 18 days ago • 199
MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages Paper • 2410.01036 • Published Oct 1, 2024 • 15
HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors Paper • 2408.06019 • Published Aug 12, 2024 • 15
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction Paper • 2409.18124 • Published Sep 26, 2024 • 32
Llama 3.2 Collection Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 27 items • Updated 20 days ago • 54
Fine-tuning LLMs to 1.58bit: extreme quantization made easy Article • Published Sep 18, 2024 • 223
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Nov 28, 2024 • 529
ReMamba: Equip Mamba with Effective Long-Sequence Modeling Paper • 2408.15496 • Published Aug 28, 2024 • 11
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27, 2024 • 39
The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design Paper • 2408.12503 • Published Aug 22, 2024 • 24
Controllable Text Generation for Large Language Models: A Survey Paper • 2408.12599 • Published Aug 22, 2024 • 64
Jamba-1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated 8 days ago • 84