metadata
license: cc0-1.0
datasets:
  - Lichess/standard-chess-games
pipeline_tag: text2text-generation
tags:
  - chess

Model card for chessPT

A pretrained decoder-only transformer model for chess move prediction.

Intended use

Predict new moves in a chess game based on PGN tokens.
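
As an illustration of what "PGN tokens" could look like, here is a minimal sketch of a character-level tokenizer over PGN movetext. This is an assumption: this card does not describe the tokenizer chessPT actually uses, and the names `pgn`, `encode`, and `decode` are hypothetical. Check train.py for the real scheme.

```python
# Hypothetical character-level tokenizer for PGN movetext.
# Assumption: the real chessPT tokenizer may differ; this only illustrates the idea.
pgn = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6"

# Build the vocabulary from the distinct characters in the text.
vocab = sorted(set(pgn))
stoi = {ch: i for i, ch in enumerate(vocab)}  # character -> token id
itos = {i: ch for ch, i in stoi.items()}      # token id -> character


def encode(text):
    """Map a string to a list of integer token ids."""
    return [stoi[ch] for ch in text]


def decode(ids):
    """Map a list of integer token ids back to a string."""
    return "".join(itos[i] for i in ids)


# A game prefix encoded as ids; a model would continue this sequence
# one token at a time, and decode() would turn the result back into moves.
prompt_ids = encode("1. e4 e5 2. ")
```

In this scheme a single move such as `Nf3` spans several tokens, so predicting one move means sampling tokens until the next separator.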

Implementation

The model implementation is based on Andrej Karpathy's nanoGPT, following his "Neural Networks: Zero to Hero" video series on YouTube.

Training

You can find the training script, train.py, in the repository's files. It also contains the hyperparameters used for training:

context_size = 256
batch_size = 128
max_iters = 30_000
learning_rate = 3e-5
eval_interval = 100
eval_iters = 20
n_embed = 384
n_layer = 6
n_head = 6
dropout = 0.2
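
To give a feel for what these values imply, the following sketch derives a few quantities from them. It rests on assumptions not confirmed by this card: one batch per iteration without gradient accumulation, and a standard GPT block with a 4x MLP expansion, ignoring biases, layer norms, and embeddings. The derived variable names are hypothetical.

```python
# Hyperparameters copied from the list above.
context_size = 256
batch_size = 128
max_iters = 30_000
n_embed = 384
n_layer = 6
n_head = 6

# Each attention head works on an equal slice of the embedding.
head_size = n_embed // n_head

# Tokens processed per optimizer step, assuming one batch per iteration.
tokens_per_iter = batch_size * context_size

# Upper bound on tokens seen during training (batches may repeat data).
total_tokens = tokens_per_iter * max_iters

# Rough weight count of the transformer blocks, assuming a standard GPT block:
# 4 * n_embed^2 for attention (q, k, v, output projection)
# 8 * n_embed^2 for an MLP with 4x expansion.
approx_block_params = 12 * n_embed**2 * n_layer

print(f"head_size           = {head_size}")
print(f"tokens_per_iter     = {tokens_per_iter:,}")
print(f"total_tokens        = {total_tokens:,}")
print(f"approx_block_params = {approx_block_params:,}")
```

Under these assumptions the blocks hold roughly 10.6 million weights, and training processes just under one billion tokens.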