Spaces: nanotron/ultrascale-playbook

Typos

#80

by iandanforth - opened 5 days ago

Discussion

iandanforth

5 days ago

•

edited 5 days ago

"knowledge and technics" -> "knowledge and techniques"

"here to changes that" -> "here to change that"

"simplest to the most raffined one" -> "simplest to the most refined"

"We'll assumes you" -> "We'll assume you"

"how deep learning model are trained" -> "how deep learning models are trained"

"to fully understand how how performing LLMs" -> "to fully understand how high performing LLMs" (guessing at the intent here)

"what it’s advantages and limits are" -> "what its advantages and limits are"

iandanforth changed discussion title from Typo to Typos 5 days ago

philbutler

5 days ago

In the cheatsheet:
"ep: context parallelism" → "ep: expert parallelism"

Thanks for this amazing book, HF!

philbutler

5 days ago

"When training a neural network model, one store several items in memory:" → "When training a neural network model, one stores several items in memory:"

thomwolf

Nanotron Research org 5 days ago

•

edited 5 days ago

Awesome! Thanks a lot for the pull request

erlebach

5 days ago

Is the formula bst=bs∗seq correct? bs = bst * seq seems like the correct formula.

mac1181

4 days ago

"and are roughtly familiar" -> "and are roughly familiar"

nouamanetazi

Nanotron Research org 3 days ago

"Is the formula bst=bs∗seq correct? bs = bst * seq seems like the correct formula."

It depends on how you define "bst" and "bs". We chose to define "bst" as the batch size in tokens, which is bs * seq (batch size in samples times sample length).
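
For anyone who finds the two readings confusing, here is a minimal Python sketch of the definition above. The function names and the example values (1024 samples, sequence length 4096) are illustrative assumptions, not taken from the playbook.

```python
def batch_size_in_tokens(bs: int, seq: int) -> int:
    """bst = bs * seq: batch size in samples times sample length in tokens."""
    return bs * seq


def batch_size_in_samples(bst: int, seq: int) -> int:
    """Inverse relation: bs = bst / seq (requires bst to be a multiple of seq)."""
    if bst % seq != 0:
        raise ValueError("bst must be a multiple of seq")
    return bst // seq


# Hypothetical example values, chosen only for illustration.
bst = batch_size_in_tokens(bs=1024, seq=4096)
print(bst)  # 4194304 tokens per batch
```

Under this definition, bs = bst * seq would only hold if "bs" were the token count and "bst" the sample count, which is the opposite of the naming described above.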

hannayukhymenko

3 days ago

Small typos:

"Using the Pytorch profiler we can understand how memory is allocated througho ut training" -> "Using the Pytorch profiler we can understand how memory is allocated throughout training"
"Why does the first step looks different:" -> "Why does the first step look different:"
The TeX type text is not visible here:

mcneela

1 day ago

"When training a neural network model, one store several items in memory:" --> "When training a neural network model, one stores several items in memory:"

mcneela

1 day ago

I'm not sure which files to change in a PR with edits. I only see the PDF file, not the source Markdown file.

mcneela

1 day ago

"You would think for a model you could compute the memory requirements exactly but there are a few additional memory occupants that makes it hard to be exact:" -->
"You would think for a model you could compute the memory requirements exactly, but there are a few additional memory occupants that make it hard to be exact:"

eliebak

Nanotron Research org about 4 hours ago

Just merged #88 (with fixes for the previous messages where I put a thumbs up). Don't hesitate to ping me if there are still typos (and thanks for the "old text" -> "new text" format, it made my life much easier, haha).
