Elie Bakouch

eliebak

AI & ML interests

Training LLMs @ 🤗

Recent Activity

new activity about 1 hour ago

nanotron/ultrascale-playbook:Few Errors

new activity about 1 hour ago

nanotron/ultrascale-playbook:Typos

new activity about 2 hours ago

nanotron/ultrascale-playbook:Link to torchao is broken

Posts 1

Wow, impressive 340B model by nvidia with a nice permissive license! 🚀 The technical report is full of insights and seems to use a different learning rate schedule than cosine, probably a variant of WSD. Hope to get more info on that! 👀

nvidia/nemotron-4-340b-666b7ebaf1b3867caf2f1911

Articles 5

Open R1: Update #2

Collections 2

Papers 3

arxiv:2502.02737

arxiv:2412.01152

arxiv:2405.18392

Models 12

eliebak/SmolLM-360M-Instruct-Q8_0-GGUF

Updated Aug 13, 2024 • 9

eliebak/the-tokenizer-v1.5

Updated Jul 4, 2024

eliebak/the-tokenizer-v2

Updated Jun 17, 2024

eliebak/wsd_124M_300B_fw

Text Generation • Updated Jun 11, 2024 • 74

eliebak/wsd_124M_300B_edu

Text Generation • Updated Jun 11, 2024 • 78

eliebak/wsd_124M_150B_edu

Text Generation • Updated Jun 11, 2024 • 76

eliebak/wsd_124M_150B_fw

Text Generation • Updated Jun 11, 2024 • 76

eliebak/cos_124M_150B_fw

Text Generation • Updated Jun 9, 2024 • 58

eliebak/cos_124M_150B_edu

Text Generation • Updated Jun 9, 2024 • 55

eliebak/debug-cos-100B

Text Generation • Updated Jun 8, 2024 • 53

Datasets 3

eliebak/very-smollm-corpus

Viewer • Updated Sep 9, 2024 • 4.58M • 23 • 2

eliebak/Buzz_wo_chatml_format

Viewer • Updated Jun 25, 2024 • 31.2M • 164 • 1

eliebak/Buzz_chatml_format

Viewer • Updated Jun 15, 2024 • 31.2M • 472