Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

Recent Activity

liked a model 1 day ago

HuggingFaceTB/SmolLM2-1.7B-Instruct-16k

pszemraj's activity

upvoted a paper about 9 hours ago

Thus Spake Long-Context Large Language Model

Paper • 2502.17129 • Published 1 day ago • 51

upvoted 2 papers 2 days ago

How to Get Your LLM to Generate Challenging Problems for Evaluation

Paper • 2502.14678 • Published 5 days ago • 14

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 5 days ago • 91

upvoted 2 papers 5 days ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published 8 days ago • 26

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 6 days ago • 143

upvoted 7 papers 8 days ago

An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging

Paper • 2502.09056 • Published 13 days ago • 30

upvoted 2 papers 15 days ago

Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published 20 days ago • 53

Great Models Think Alike and this Undermines AI Oversight

Paper • 2502.04313 • Published 19 days ago • 30

upvoted 2 collections about 1 month ago

NeMo Curator - Classifier Models

Collection

Classifier models that can be used in NeMo Curator for labelling/filtering datasets. • 11 items • Updated 11 days ago • 16

SmolVLM 256M & 500M

Collection

Collection for models & demos for even smoller SmolVLM release • 12 items • Updated 5 days ago • 69

upvoted an article about 1 month ago

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

Article • Jan 23 • 142

upvoted a collection about 1 month ago

Deita

Collection

14 items • Updated May 20, 2024 • 12

upvoted 2 papers about 1 month ago

An Empirical Study of Autoregressive Pre-training from Videos

Paper • 2501.05453 • Published Jan 9 • 37

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14 • 55