---
license: cc0-1.0
datasets:
- Lichess/standard-chess-games
pipeline_tag: text2text-generation
tags:
- chess
---
# Model card for chessPT
A pretrained decoder-only transformer model for chess move prediction.
## Intended use
Predict the next moves in a chess game from a sequence of PGN tokens.
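
To illustrate what PGN input looks like, the sketch below splits PGN movetext into individual moves. It is only an illustration: the helper `split_movetext` is hypothetical, and the tokenization actually used by chessPT is defined in the repository and may differ (for example, it may operate on characters rather than whole moves).

```python
import re

# Game result markers that can terminate PGN movetext.
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}

def split_movetext(movetext: str) -> list[str]:
    """Split PGN movetext into SAN moves (hypothetical helper, not part of chessPT)."""
    moves = []
    for token in movetext.split():
        # Skip move numbers such as "1." or "12..." and the final result marker.
        if re.fullmatch(r"\d+\.(\.\.)?", token) or token in RESULTS:
            continue
        moves.append(token)
    return moves

moves = split_movetext("1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 1-0")
print(moves)
```

A prompt for the model would be the opening moves of a game in this notation, and the model's continuation would be read as the predicted moves.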
## Implementation
The model implementation is based on Andrej Karpathy's [nanoGPT](https://github.com/karpathy/nanoGPT) and follows his "Zero to Hero" video series on [YouTube](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ).
## Training
You can find the training script in the repository's files under `train.py`.
It also contains the hyperparameters used for training:
```python
context_size = 256
batch_size = 128
max_iters = 30_000
learning_rate = 3e-5
eval_interval = 100
eval_iters = 20
n_embed = 384
n_layer = 6
n_head = 6
dropout = 0.2
```
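
As a rough sanity check, a few derived quantities can be computed from these hyperparameters. This is a back-of-the-envelope sketch, not output from `train.py`. It assumes that each iteration processes a single batch of `batch_size` sequences of `context_size` tokens (no gradient accumulation) and that the transformer blocks follow the nanoGPT layout with a 4x MLP expansion. Under that layout, the block parameters are approximately `12 * n_layer * n_embed**2`, ignoring embeddings, biases, and layer norms.

```python
# Values copied from the hyperparameters above.
context_size = 256
batch_size = 128
max_iters = 30_000
n_embed = 384
n_layer = 6
n_head = 6

# Each attention head works on an equal slice of the embedding.
head_size = n_embed // n_head

# Tokens processed per iteration and over the full run (assumes one batch per iteration).
tokens_per_iter = batch_size * context_size
total_tokens = tokens_per_iter * max_iters

# Rough parameter count of the transformer blocks:
# about 4*d^2 for attention plus 8*d^2 for a 4x MLP, per layer.
approx_block_params = 12 * n_layer * n_embed ** 2

print(f"head_size           = {head_size}")
print(f"tokens_per_iter     = {tokens_per_iter:,}")
print(f"total_tokens        = {total_tokens:,}")
print(f"approx_block_params = {approx_block_params:,}")
```

Under these assumptions the model is in the range of roughly 10 million block parameters and is trained on just under one billion tokens.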