Cody Steinmetz PRO

codys12

Recent Activity

upvoted an article 10 days ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

updated a Space 11 days ago

codys12/NetCom-to-WooComerce

published a Space 11 days ago

codys12/NetCom-to-WooComerce

codys12's activity

upvoted an article 10 days ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18, 2024

• 223

updated a Space 11 days ago

NetCom To WooComerce

🏃

CSV conversion tool

published a Space 11 days ago

NetCom To WooComerce

🏃

CSV conversion tool

upvoted a paper 19 days ago

Optimizing Large Language Model Training Using FP4 Quantization

Paper • 2501.17116 • Published Jan 28 • 36

liked a model 28 days ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated 7 days ago • 4.36M • 10.6k

liked a dataset 28 days ago

open-thoughts/OpenThoughts-114k

Viewer • Updated 11 days ago • 228k • 112k • 626

upvoted a collection 6 months ago

Mamba2-In-Llama3

Collection

Mamba2 distilled from Llama3 8B instruct. The Mamba in the Llama: Distilling and Accelerating Hybrid Models (https://arxiv.org/abs/2408.15237). • 4 items • Updated Sep 9, 2024 • 2

upvoted a paper 6 months ago

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 40

updated a dataset 6 months ago

codys12/LlamaKD

Viewer • Updated Aug 21, 2024 • 4M • 16.1k • 13

updated a model 7 months ago

codys12/Meta-Llama-3.1-405B-bnb-4bit

Text Generation • Updated Aug 14, 2024 • 4

New activity in unsloth/Meta-Llama-3.1-405B-bnb-4bit 7 months ago

Why did num_key_value_heads change from 16 to 8?

#1 opened 7 months ago by codys12

liked a dataset 7 months ago

codys12/LlamaKD

Viewer • Updated Aug 21, 2024 • 4M • 16.1k • 13

updated 4 datasets 7 months ago

updated 3 models 7 months ago

codys12/dshybrid

Text Generation • Updated Jul 26, 2024 • 6

codys12/dshybridempty

Text Generation • Updated Jul 26, 2024 • 49

codys12/dshybrid-empty

Updated Jul 26, 2024

updated a model 8 months ago

codys12/BitJamba-init

Text Generation • Updated Jul 4, 2024 • 7