NanoGPT Speedrun

Following https://github.com/KellerJordan/modded-nanogpt for fun and learning.

Run Info

baseline/

  • Run on the Lightning AI cloud, using one L40S GPU
  • Batch size set to 32
  • VRAM usage: 26.95 GB (25698 MiB reported by nvidia-smi)
  • ~4 seconds per step, 3200 steps in total
  • Checkpoint saved every 320 steps (see the sketch below)
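A minimal sketch of that checkpointing cadence, for reference. This is not taken from the modded-nanogpt code; the helper name, checkpoint layout, and output directory are all made up:

    import os
    import torch

    TOTAL_STEPS = 3200
    CKPT_EVERY = 320  # one checkpoint every 320 steps -> 10 checkpoints over the run

    def maybe_checkpoint(step, model, opt, out_dir="baseline/ckpt"):
        """Hypothetical helper: save model/optimizer state every CKPT_EVERY steps."""
        if step > 0 and step % CKPT_EVERY == 0:
            os.makedirs(out_dir, exist_ok=True)
            torch.save(
                {"step": step, "model": model.state_dict(), "opt": opt.state_dict()},
                os.path.join(out_dir, f"step_{step:04d}.pt"),
            )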

Training loss

As an empirical check of the neural scaling law:

baseline/analysis/loss_plot2.png

(Fitted line: log y = -0.11 * log x + 0.9, where x is the training step (up to 3200) and y is the training loss.)
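The fit amounts to a least-squares line in log-log space. A sketch of how to reproduce it, assuming the logged loss curve is available as (step, loss) arrays (the array names are made up, and natural log is assumed; the plot may use a different base):

    import numpy as np

    def fit_power_law(steps, losses):
        """Fit log y = a * log x + b, i.e. y ~ e^b * x^a."""
        steps = np.asarray(steps, dtype=float)
        losses = np.asarray(losses, dtype=float)
        mask = steps > 0  # log is undefined at step 0
        a, b = np.polyfit(np.log(steps[mask]), np.log(losses[mask]), deg=1)
        return a, b  # fit reported above: a ~ -0.11, b ~ 0.9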

Demo

Available at https://huggingface.co/spaces/lemonteaa/nanogpt-speedrun-demo

(WIP)
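The demo code itself is not shown here, but a minimal local inference loop might look like the following. Everything in it is an assumption: that a checkpoint in the layout sketched above has already been loaded into a model whose forward pass returns logits of shape [batch, seq, vocab], and that GPT-2 BPE via tiktoken is the right tokenizer:

    import tiktoken
    import torch

    enc = tiktoken.get_encoding("gpt2")

    @torch.no_grad()
    def sample(model, prompt, max_new_tokens=64, device="cuda"):
        """Greedy decoding sketch; model(ids) -> logits is an assumed signature."""
        ids = torch.tensor([enc.encode(prompt)], device=device)
        for _ in range(max_new_tokens):
            logits = model(ids)  # [1, T, vocab], assumed
            next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
            ids = torch.cat([ids, next_id], dim=1)
        return enc.decode(ids[0].tolist())

Loading would be the mirror image of the checkpoint sketch, e.g. model.load_state_dict(torch.load("baseline/ckpt/step_3200.pt")["model"]) with the hypothetical path from above.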
