Questions?

#84 opened by nouamanetazi (Nanotron Research org)

If you have any questions about the content of the blog, feel free to ask here!

Hi, I'm just a newbie trying to learn about training LLMs, so this might be a dumb question.
Can anyone explain how the batch size affects throughput (tokens per second)? And why does a larger batch size tend to make less use of each training token, slowing convergence and potentially wasting compute?
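For concreteness, here is a minimal sketch (a toy PyTorch model with synthetic token batches; all names and sizes are placeholders, not the blog's actual setup) of how training throughput in tokens per second could be measured at different batch sizes:

```python
import time
import torch
import torch.nn as nn

# Toy stand-in for an LLM: embedding followed by a linear head.
vocab_size, seq_len, hidden = 1000, 128, 256
model = nn.Sequential(nn.Embedding(vocab_size, hidden), nn.Linear(hidden, vocab_size))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def tokens_per_second(batch_size: int, steps: int = 20) -> float:
    """Time a few training steps and report tokens processed per second."""
    start = time.perf_counter()
    for _ in range(steps):
        tokens = torch.randint(0, vocab_size, (batch_size, seq_len))
        logits = model(tokens)                                  # (batch, seq, vocab)
        loss = loss_fn(logits.reshape(-1, vocab_size), tokens.reshape(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    elapsed = time.perf_counter() - start
    return steps * batch_size * seq_len / elapsed

for bs in (8, 16, 32, 64):
    print(f"batch_size={bs}: {tokens_per_second(bs):,.0f} tokens/sec")
```

Larger batches generally keep the hardware busier, so tokens/sec rises until memory or compute saturates; but each optimizer step then averages gradients over more tokens, so fewer parameter updates happen per token seen.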

(attached image: graph)
Do the values in this graph really make sense? `[40, 180, 320, 460]^T @ [20, 40]` ...
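If that notation means a 4-element column vector multiplied by a 2-element row vector (an outer product), the result would be the 4x2 matrix below; this is just an assumed reading of the truncated expression, not something confirmed by the figure:

$$
\begin{bmatrix} 40 \\ 180 \\ 320 \\ 460 \end{bmatrix}
\begin{bmatrix} 20 & 40 \end{bmatrix}
=
\begin{bmatrix}
 800 & 1600 \\
3600 & 7200 \\
6400 & 12800 \\
9200 & 18400
\end{bmatrix}
$$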
