Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 22 items • Updated 1 day ago • 92
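As a quick orientation, a minimal sketch of loading one of this collection's checkpoints with transformers; it assumes the published "Qwen/Qwen2.5-Coder-7B-Instruct" repo id and the standard chat-template workflow, and is illustrative rather than an official recipe.

```python
# Sketch: load a Qwen2.5-Coder instruct checkpoint and run one chat turn.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-7B-Instruct"  # one of the published sizes
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
# Render the chat template into input ids and move them to the model's device.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```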
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned variants in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Sep 18 • 346
Qwen/Qwen2.5 FP8 Collection Collection of Qwen2.5 models quantized to FP8 • 7 items • Updated Sep 18 • 1
Aya Datasets Collection The Aya Collection is a massive multilingual collection spanning over 100 languages and consisting of 513 million instances of prompts and completions. • 5 items • Updated Jun 28 • 14
C4AI Command R Collection C4AI Command-R is a research release of a highly performant 35-billion-parameter generative model. Command-R is a large language model with open weights. • 4 items • Updated Aug 30 • 19
C4AI Command R Plus Collection C4AI Command R+ is an open-weights research release of a 104-billion-parameter model with highly advanced capabilities. • 4 items • Updated Aug 30 • 53
C4AI Aya 23 Collection Aya 23 is an open-weights research release of an instruction-tuned model with highly advanced multilingual capabilities. • 4 items • Updated Aug 6 • 50
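For the FP8 checkpoints, a serving sketch with vLLM, which reads the quantization config stored in such checkpoints; the repo id below is an assumption (the collection's item names aren't shown here), and FP8 inference requires a GPU with FP8 support.

```python
# Sketch, not a verified recipe: "Qwen/Qwen2.5-7B-Instruct-FP8" is an
# assumed repo id following the usual "<model>-FP8" naming pattern.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct-FP8")  # assumed repo id
params = SamplingParams(temperature=0.7, max_tokens=64)

for out in llm.generate(["Summarize FP8 quantization in one sentence."], params):
    print(out.outputs[0].text)
```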
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations Paper • 2410.18860 • Published 17 days ago • 7
C4AI Aya Expanse Collection Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 3 items • Updated 17 days ago • 25
MiniPLM: Knowledge Distillation for Pre-Training Language Models Paper • 2410.17215 • Published 19 days ago • 12
QQQ: Quality Quattuor-Bit Quantization for Large Language Models Paper • 2406.09904 • Published Jun 14 • 1
Gemma-APS Release Collection Gemma models for text-to-propositions segmentation. The models are distilled from a fine-tuned Gemini Pro model applied to multi-domain synthetic data. • 3 items • Updated 26 days ago • 19
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 26 days ago • 129