
NanoGPT Speedrun

This repository follows https://github.com/KellerJordan/modded-nanogpt, done for fun and as a learning exercise.

Run Info

baseline/

Ran on Lightning cloud, using a single L40S GPU
Batch size set to 32
VRAM usage: 26.95 GB (25698 MiB reported by nvidia-smi)
4 seconds per step, 3200 steps in total
Checkpoint saved every 320 steps
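
The figures above imply a total wall-clock time and a checkpoint count that are not stated directly. A rough back-of-the-envelope sketch follows. It assumes the step time is a steady 4 seconds, that a checkpoint is written at every multiple of 320 steps including the final step, and that the nvidia-smi figure is in MiB; the variable names are illustrative only.

```python
# Back-of-the-envelope numbers derived from the run info above.
# Assumptions: constant 4 s/step, checkpoints at steps 320, 640, ..., 3200,
# and nvidia-smi reporting memory in MiB (1 MiB = 1024**2 bytes).
SECONDS_PER_STEP = 4
TOTAL_STEPS = 3200
CHECKPOINT_EVERY = 320
VRAM_MIB = 25698

total_seconds = SECONDS_PER_STEP * TOTAL_STEPS
total_hours = total_seconds / 3600
num_checkpoints = TOTAL_STEPS // CHECKPOINT_EVERY
vram_gb = VRAM_MIB * 1024**2 / 1e9  # decimal gigabytes

print(f"Estimated training time: {total_seconds} s ({total_hours:.2f} h)")
print(f"Checkpoints written: {num_checkpoints}")
print(f"VRAM: {vram_gb:.2f} GB")
```

Under these assumptions the run takes about 3.6 hours and produces 10 checkpoints, and the MiB figure converts to the 26.95 GB quoted above.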

Training loss

To experimentally check the neural scaling law:

(Fitted line: log y = -0.11 * log x + 0.9, where x is the training step (0 to 3200) and y is the training loss.)
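
A line of this form can be obtained with an ordinary least-squares fit in log-log space. The sketch below is not the script used for this run. It assumes base-10 logarithms (the base is not stated above), starts at step 1 because log 0 is undefined, and uses synthetic points generated from the quoted line itself rather than the run's actual loss values. The function names `fit_power_law` and `predicted_loss` are hypothetical.

```python
import math

def fit_power_law(steps, losses):
    """Least-squares fit of log10(loss) = slope * log10(step) + intercept."""
    xs = [math.log10(s) for s in steps]
    ys = [math.log10(v) for v in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predicted_loss(step, slope=-0.11, intercept=0.9):
    """Evaluate the fitted line at a step (assumes base-10 logs, step >= 1)."""
    return 10 ** (slope * math.log10(step) + intercept)

# Synthetic points generated from the quoted line, NOT the real training log.
steps = [10, 100, 320, 1000, 3200]
losses = [predicted_loss(s) for s in steps]

slope, intercept = fit_power_law(steps, losses)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"loss predicted by the line at step 3200: {predicted_loss(3200):.2f}")
```

To fit real data, replace `steps` and `losses` with the logged step numbers and training losses. If the original fit used natural logarithms instead, swap `math.log10` for `math.log` and `10 **` for `math.exp`.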

Demo

Available at https://huggingface.co/spaces/lemonteaa/nanogpt-speedrun-demo

(WIP)
