Anthonny Olime

Aviv-anthonnyolime

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

upvoted a collection 3 days ago

updated a collection 7 days ago

Aviv-anthonnyolime's activity

upvoted a paper 3 days ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 5 days ago • 140

upvoted a collection 3 days ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 3 days ago • 222

upvoted 2 articles 9 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • 13 days ago • 680

Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial

Article • 9 days ago • 31

upvoted 2 collections 11 days ago

233 items • Updated 1 day ago • 3

Papers - Google

53 items • Updated Nov 2, 2024 • 2

upvoted a paper 11 days ago

ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion

Paper • 2403.18818 • Published Mar 27, 2024 • 26

upvoted a paper 12 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 18 days ago • 306

upvoted an article 12 days ago

PEFT: Parameter-Efficient Fine-Tuning Methods for LLMs

Article • 16 days ago • 12

upvoted 2 papers 20 days ago

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25, 2024 • 77

LaDiMo: Layer-wise Distillation Inspired MoEfier

Paper • 2408.04278 • Published Aug 8, 2024 • 1

upvoted 2 papers 24 days ago

MoH: Multi-Head Attention as Mixture-of-Head Attention

Paper • 2410.11842 • Published Oct 15, 2024 • 21

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 26 days ago • 273

upvoted 2 collections about 1 month ago

Phi-4

Phi-4 small language model. • 2 items • Updated Jan 8 • 46

Cosmos

The collection of Cosmos models • 31 items • Updated 23 days ago • 255

upvoted 3 papers about 1 month ago

TinyHelen's First Curriculum: Training and Evaluating Tiny Language Models in a Simpler Language Environment

Paper • 2501.00522 • Published Dec 31, 2024 • 1

Virgo: A Preliminary Exploration on Reproducing o1-like MLLM

Paper • 2501.01904 • Published Jan 3 • 31

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Paper • 2501.01957 • Published Jan 3 • 42

upvoted an article about 1 month ago

Fine-tune SmolLM's on custom synthetic data

Article • Jan 5 • 17

upvoted a paper about 1 month ago

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Paper • 2306.07691 • Published Jun 13, 2023 • 6