---
license: cc0-1.0
datasets:
- Lichess/standard-chess-games
pipeline_tag: text2text-generation
tags:
- chess
---

# Model card for chessPT

A pretrained decoder-only transformer model for chess move prediction.

## Intended use

Predict the next moves in a chess game from the PGN tokens of the moves played so far.
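
To make the idea of "PGN tokens" concrete, here is a minimal sketch of turning PGN movetext into token ids and back. It assumes a character-level tokenizer in the style of the lecture series this model follows; the vocabulary, the `encode`/`decode` helpers, and the sample game are hypothetical and may differ from what chessPT actually uses. Only the context size of 256 comes from the training parameters listed in this card.

```python
# Hypothetical character-level tokenizer for PGN movetext.
# This is an illustrative assumption, not necessarily chessPT's vocabulary.
pgn = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6"

# Build the vocabulary from the characters that occur in the sample text.
vocab = sorted(set(pgn))
stoi = {ch: i for i, ch in enumerate(vocab)}  # character -> token id
itos = {i: ch for ch, i in stoi.items()}      # token id -> character


def encode(text):
    """Map a PGN string to a list of integer token ids."""
    return [stoi[ch] for ch in text]


def decode(ids):
    """Map a list of token ids back to a PGN string."""
    return "".join(itos[i] for i in ids)


# Keep only the most recent tokens so the prompt fits the context window.
context_size = 256
prompt_ids = encode(pgn)[-context_size:]

print(prompt_ids)
print(decode(prompt_ids))
```

A model trained this way would receive `prompt_ids` as input and emit further token ids, which `decode` turns back into PGN text.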

## Implementation

The model implementation is based on Andrej Karpathy's [nanoGPT](https://github.com/karpathy/nanoGPT), following his "Zero to Hero" video series on [YouTube](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ).

## Training

The training script is available in the repository's files as `train.py`.

It also contains the hyperparameters used for training:
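
```python
context_size = 256    # maximum number of tokens per input sequence
batch_size = 128      # sequences processed per training iteration
max_iters = 30_000    # total number of training iterations
learning_rate = 3e-5  # optimizer step size
eval_interval = 100   # evaluate the loss every 100 iterations
eval_iters = 20       # batches averaged for each loss estimate
n_embed = 384         # embedding dimension
n_layer = 6           # number of transformer blocks
n_head = 6            # attention heads per block
dropout = 0.2         # dropout probability
```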
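
A few quantities follow directly from these parameters. The sketch below works them out; it assumes one full batch per iteration (no gradient accumulation) and, for the parameter estimate, the usual GPT-style block layout with a 4x-wide feed-forward layer, as in nanoGPT. The estimate ignores embeddings, biases, and layer norms, so treat it as a rough order of magnitude rather than the exact size of chessPT.

```python
# Values copied from the training parameters above.
context_size = 256
batch_size = 128
max_iters = 30_000
eval_interval = 100
n_embed = 384
n_layer = 6
n_head = 6

# Each attention head works on an equal slice of the embedding.
head_size = n_embed // n_head                 # 384 / 6 = 64

# Tokens processed per iteration, assuming one full batch per step.
tokens_per_iter = batch_size * context_size   # 128 * 256 = 32_768

# Tokens seen over the whole run (with repetition if the data is reused).
total_tokens = tokens_per_iter * max_iters    # 983_040_000

# Number of loss evaluations during training.
num_evals = max_iters // eval_interval        # 300

# Rule-of-thumb weight count for the transformer blocks:
# about 4*d^2 for attention plus 8*d^2 for a 4x-wide MLP, per layer.
approx_block_params = 12 * n_layer * n_embed ** 2  # 10_616_832

print(f"head size:            {head_size}")
print(f"tokens per iteration: {tokens_per_iter:,}")
print(f"total tokens seen:    {total_tokens:,}")
print(f"loss evaluations:     {num_evals}")
print(f"approx. block params: {approx_block_params:,}")
```

Under these assumptions the run processes just under one billion tokens through a model with roughly 10.6 million weights in its transformer blocks.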