Yi Cui

onekq

AI & ML interests

Benchmarks, Code Generation Models

Organizations

MLX Community, ONEKQ AI

onekq's activity

posted an update 1 day ago
So πŸ‹DeepSeekπŸ‹ hits the mainstream media. But it has been a star in our little cult for at least 6 months. Its meteoric success is not overnight, but two years in the making.

To learn their history, just look at their 🤗 repo https://huggingface.co/deepseek-ai

* End of 2023, they launched their first model (pretrained by themselves) following the Llama 2 architecture
* June 2024, v2 (MoE architecture) surpassed Gemini 1.5, but stayed behind Mistral
* September 2024, v2.5 surpassed GPT-4o mini
* December 2024, v3 surpassed GPT-4o
* Now R1 surpasses o1

Most importantly, if you think DeepSeek's success is singular and unrivaled, that's WRONG. The following models are also at or near the o1 bar.

* Minimax-01
* Kimi k1.5
* Doubao 1.5 pro
reacted to clem's post with 🔥 1 day ago
replied to their post 2 days ago

My conclusion is the same. The R1 paper already reported lower success rates for the distilled models. This is not surprising, since we cannot expect the same outcomes from a much smaller model.

Here is the problem. The small models released by frontier labs are always generic, i.e. decent but weaker than the flagship model on every benchmark. But we GPU deplorables often want a specialized model that is excellent at one thing only, hence the disappointment.

I guess we will have to help ourselves on this one. Distill an opinionated dataset from the flagship model into a small model of your choice, then hill-climb the benchmark you care about. A rough sketch of that loop is below.
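Not something from the post itself, but here is a minimal sketch of that recipe, assuming an OpenAI-compatible API for the flagship (teacher) model and a small Hugging Face model as the student. The model names, prompts, and hyperparameters are placeholders, not recommendations:

```python
# Sketch: distill an "opinionated" dataset from a flagship model,
# then fine-tune a small model on it. All names below are placeholders.
from openai import OpenAI
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# 1. Distill: collect flagship completions on prompts from your one target domain.
prompts = ["Write a React component that validates an email field."]  # your tasks
records = []
for p in prompts:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder flagship "teacher"
        messages=[{"role": "user", "content": p}],
    )
    records.append({"text": p + "\n" + resp.choices[0].message.content})

# 2. Fine-tune the small "student" on the distilled, opinionated dataset.
student = "Qwen/Qwen2.5-1.5B-Instruct"  # placeholder small model
tok = AutoTokenizer.from_pretrained(student)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(student)

ds = Dataset.from_list(records).map(
    lambda r: tok(r["text"], truncation=True, max_length=2048),
    remove_columns=["text"],
)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="distilled-student",
                           num_train_epochs=3,
                           per_device_train_batch_size=1),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
# 3. Hill-climb: evaluate on the one benchmark you care about,
#    adjust the prompt mix, and repeat.
```

The point of step 1 is that the dataset is opinionated: it covers only your target domain, so the student can beat a generic small model there even if it is worse everywhere else.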

replied to their post 3 days ago

1000% agree.

Also, reasoning models sure spit out lots of tokens. The same benchmark costs 4x or 5x as much money and time to run as it does with regular LLMs (rough math at the end of this reply). Exciting times for inference players.

Have you tried the distilled models of R1 (Qwen and Llama)?
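To make the 4x-5x concrete, a back-of-the-envelope sketch; the token counts and price here are made-up assumptions, not measurements:

```python
# Rough cost model: same benchmark, same prompts, but a reasoning model
# also bills you for its chain-of-thought tokens. Numbers are assumptions.
PRICE_PER_1M_OUTPUT_TOKENS = 10.00  # hypothetical $/1M output tokens
TASKS = 1000                        # e.g. a 1k-problem benchmark

regular_tokens_per_task = 500       # direct answer only
reasoning_tokens_per_task = 2500    # answer plus a long chain of thought

def run_cost(tokens_per_task: int) -> float:
    return TASKS * tokens_per_task * PRICE_PER_1M_OUTPUT_TOKENS / 1e6

regular = run_cost(regular_tokens_per_task)
reasoning = run_cost(reasoning_tokens_per_task)
print(f"regular:   ${regular:.2f}")
print(f"reasoning: ${reasoning:.2f}  ({reasoning / regular:.0f}x)")
# Wall-clock time scales with output tokens in roughly the same way.
```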

replied to their post 4 days ago

+1

Also, the velocity of progress. I have been wanting to learn Monte Carlo Tree Search, process reward models, etc., but haven't had the time. I guess now I can skip them 🤗

posted an update 5 days ago
This is historic. 🎉

DeepSeek πŸ‹R1πŸ‹ surpassed OpenAI πŸ“o1πŸ“ on the dual leaderboard. What a year for the open source!

onekq-ai/WebApp1K-models-leaderboard
posted an update 6 days ago
πŸ‹DeepSeek πŸ‹ is the real OpenAI 😯
posted an update 12 days ago