philipp-zettl/chessPT

	---
	license: cc0-1.0
	datasets:
	- Lichess/standard-chess-games
	pipeline_tag: text2text-generation
	tags:
	- chess
	---
	# Model card for chessPT
	A pretrained decoder-only transformer model for chess move prediction.

	## Intended use
	Predict the next moves of a chess game from a sequence of PGN tokens.
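
As an illustration of how PGN text could be turned into model input, here is a minimal sketch of a character-level tokenizer in the style of the nanoGPT tutorials. It is an assumption, not the tokenizer shipped with chessPT: the vocabulary is built from the sample string only, and the names `sample_pgn`, `encode`, and `decode` are hypothetical. Only `context_size = 256` is taken from the training parameters.

```python
# Hypothetical character-level tokenizer for PGN movetext.
# Assumption: the real model's vocabulary is defined in the repository
# and may differ from the one built here from a single sample.
sample_pgn = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6"

# Build a vocabulary from the characters that occur in the sample.
chars = sorted(set(sample_pgn))
stoi = {ch: i for i, ch in enumerate(chars)}  # character -> token id
itos = {i: ch for ch, i in stoi.items()}      # token id -> character


def encode(text):
    """Map each character of `text` to its integer token id."""
    return [stoi[ch] for ch in text]


def decode(ids):
    """Map a list of token ids back to a string."""
    return "".join(itos[i] for i in ids)


context_size = 256  # from the training parameters in this model card

ids = encode(sample_pgn)
# Keep only the most recent tokens so the prompt fits the context window.
prompt_ids = ids[-context_size:]

print(prompt_ids)
print(decode(prompt_ids))
```

A decoder-only model would then be asked to continue `prompt_ids` one token at a time, and the generated ids would be passed through `decode` to recover the predicted moves as text.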

	## Implementation
	The model implementation is based on Andrej Karpathy's [nanoGPT](https://github.com/karpathy/nanoGPT) and follows his "Zero to Hero" video series on [YouTube](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ).

	## Training
	You can find the training script in the repository's files under `train.py`.
	It also contains the hyperparameters used for training:
	```python
	context_size = 256
	batch_size = 128
	max_iters = 30_000
	learning_rate = 3e-5
	eval_interval = 100
	eval_iters = 20
	n_embed = 384
	n_layer = 6
	n_head = 6
	dropout = 0.2
	```
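
To give a feel for what these values imply, the following sketch derives a few quantities from them. The parameter estimate assumes the standard GPT block layout used in nanoGPT (attention projections plus an MLP with 4x expansion) and ignores biases, layer norms, and embeddings, so it is a rough figure and not a number reported by the author.

```python
# Values copied from the training parameters above.
context_size = 256
batch_size = 128
max_iters = 30_000
n_embed = 384
n_layer = 6
n_head = 6

# Each attention head works on an equal slice of the embedding.
assert n_embed % n_head == 0
head_size = n_embed // n_head

# Token positions processed per optimizer step and over the whole run.
tokens_per_iter = batch_size * context_size
total_tokens = tokens_per_iter * max_iters

# Rough weight count for the transformer blocks, assuming a standard
# GPT block: 4 * n_embed^2 for attention (Q, K, V, output projection)
# plus 8 * n_embed^2 for an MLP with 4x expansion.
approx_block_params = 12 * n_layer * n_embed ** 2

print(f"head size:               {head_size}")
print(f"tokens per iteration:    {tokens_per_iter:,}")
print(f"tokens over training:    {total_tokens:,}")
print(f"approx. block params:    {approx_block_params:,}")
```

Under these assumptions the heads are 64-dimensional, each step processes 32,768 token positions, and the blocks hold roughly 10.6 million weights.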