- UI-TARS: Pioneering Automated GUI Interaction with Native Agents • arXiv:2501.12326
- ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer • arXiv:2501.15570
- Towards General-Purpose Model-Free Reinforcement Learning • arXiv:2501.16142
- Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling • arXiv:2501.16975
- Optimizing Large Language Model Training Using FP4 Quantization • arXiv:2501.17116
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • arXiv:2501.17161
- Large Language Models Think Too Fast To Explore Effectively • arXiv:2501.18009
- Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch • arXiv:2501.18512
- Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs • arXiv:2501.18585
- Reward-Guided Speculative Decoding for Efficient LLM Reasoning • arXiv:2501.19324
- HIGGS • Collection of models prequantized with [HIGGS](https://arxiv.org/abs/2411.17525) zero-shot quantization; requires the latest `transformers` to run • 17 items • Updated Dec 24, 2024
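
Since the HIGGS collection entries are already prequantized, loading one should only need the standard `from_pretrained` call with an up-to-date `transformers` install (which ships HIGGS support). A minimal sketch, assuming a placeholder repo id standing in for any checkpoint from the collection:

```python
# Minimal sketch: load a HIGGS-prequantized checkpoint from the Hub.
# Assumptions: `repo_id` is a placeholder, not a specific model from the
# collection, and `transformers` is recent enough to include HIGGS support
# (pip install -U transformers).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<org>/<model>-HIGGS"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# The quantization config is stored with the checkpoint, so a plain
# from_pretrained call is enough; device_map="auto" places weights on GPU.
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```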