garrett galloway PRO

RecViking

recreationalviking

AI & ML interests

None yet

RecViking's activity

New activity in microsoft/phi-4 6 days ago

Phi-4 with Tools

#28 opened about 2 months ago by

liked a dataset 11 days ago

GAIR/LIMR

Viewer • Updated 14 days ago • 1.39k • 306 • 20

liked a model 19 days ago

microsoft/OmniParser-v2.0

Image-Text-to-Text • Updated 14 days ago • 7.43k • 1.06k

reacted to lewtun's post with 🔥 about 1 month ago

Post

10198

We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1

5 replies

commented a paper 9 months ago

Make It Count: Text-to-Image Generation with an Accurate Number of Objects

Paper • 2406.10210 • Published Jun 14, 2024 • 77

upvoted a paper 9 months ago

Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models

Paper • 2405.20541 • Published May 30, 2024 • 22

liked a Space 9 months ago

Open LLM Leaderboard

Track, rank and evaluate open LLMs and chatbots

liked a Space about 1 year ago

Daily Papers

Complete list of past Daily Papers

liked 2 models about 1 year ago

NobodyExistsOnTheInternet/Llama-2-70b-x8-MoE-clown-truck

Text Generation • Updated Jan 23, 2024 • 1.78k • 8

01-ai/Yi-VL-34B

Image-Text-to-Text • Updated Jun 26, 2024 • 132 • 262

liked 2 models over 1 year ago

amazon/FalconLite

Text Generation • Updated Nov 17, 2023 • 566 • 170

meta-llama/Llama-2-70b-chat-hf

Text Generation • Updated Apr 17, 2024 • 34k • 2.18k

New activity in OpenAssistant/falcon-40b-sft-top1-560 over 1 year ago

Issue with multi GPU inference.

#1 opened almost 2 years ago by

New activity in open-llm-leaderboard/open_llm_leaderboard almost 2 years ago

Add license/commercial use column?

#12 opened almost 2 years ago by

New activity in HuggingFaceH4/starchat-alpha almost 2 years ago

CPU bound when loaded on GPU?

#6 opened almost 2 years ago by

liked a model almost 2 years ago

bigcode/tiny_starcoder_py

Text Generation • Updated Jun 1, 2023 • 3.28k • 73

New activity in HuggingFaceH4/starchat-alpha almost 2 years ago

General tips around inference speed?

#3 opened almost 2 years ago by

liked a model almost 2 years ago

HuggingFaceH4/starchat-alpha

Text Generation • Updated Jun 8, 2023 • 2.07k • 233

New activity in OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 almost 2 years ago

Can the model be used for commercial purposes?

#11 opened almost 2 years ago by

liked a model almost 2 years ago

bigcode/starcoder

Text Generation • Updated Oct 8, 2024 • 17.4k • 2.86k