• The GPT-2 model was trained on the BookCorpus dataset for 60K steps.
  • No positional embeddings were used (NoPE); see the sketch after this list.
  • Here is the wandb report.
  • This is for educational purposes only.
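
Since the card does not include training code, the following is a minimal sketch of one way a NoPE setup can be approximated on a stock `transformers` GPT-2: zero out and freeze the learned position-embedding table so the model receives no positional signal. The use of `GPT2LMHeadModel` and the `gpt2` base checkpoint here are illustrative assumptions, not the author's confirmed method.

```python
# Hypothetical NoPE sketch (not this card's actual training code): zero the
# learned position-embedding table and freeze it so no positional signal
# reaches the transformer blocks.
import torch
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")

with torch.no_grad():
    model.transformer.wpe.weight.zero_()             # erase positional information
model.transformer.wpe.weight.requires_grad = False   # keep the table at zero during training
```

With the table zeroed and frozen, every position contributes the same null positional vector, which is what "no position embedding" amounts to in this architecture.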
  • Model size: 124M params (Safetensors, F32).
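
Because the checkpoint is published as standard F32 Safetensors, it can be loaded locally with `transformers`. A minimal usage sketch, assuming the repo follows the stock GPT-2 layout and ships tokenizer files alongside the weights; the prompt and sampling parameters are illustrative, not from the card:

```python
# Load the checkpoint locally and sample a short continuation.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arun-AiBharat/gpt-2-bookcorpus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the repo does not include tokenizer files, the stock `gpt2` tokenizer could be substituted, since the model card describes a standard GPT-2 trained on BookCorpus.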
