---
license: apache-2.0
language:
- en
---

# ChessCLIP

ChessCLIP is a CLIP model trained to align (board, action) representations with natural language and to compute the similarity between the two in chess games.

## Model Details

- **Language(s)**: English
- **License**: Apache 2.0
- **Model Description**: A CLIP model for chess.

# Quick Start

```bash
git clone https://github.com/waterhorse1/ChessGPT
```

Clone our codebase and install all dependencies according to our README.

## Inference

```python
import sys
# make the ChessGPT sources importable before importing from them
sys.path.append('./chessclip/src')

import io

import chess.pgn
import numpy as np
import torch

import open_clip
from data.chessclip_data.feature_converter import get_lc0_input_planes_tf
from data.chessclip_data.pgn_base import generate_examples_from_game_no_comment
from open_clip.factory import get_tokenizer, load_checkpoint

# init
model_name = 'chessclip-quickgelu'
model = open_clip.create_model(model_name, pretrained='openai')
tokenizer = get_tokenizer(model_name)

# load model
load_checkpoint(model, './ChessCLIP/epoch_last.pt')

# check parameters
model.eval()
context_length = model.text.context_length
vocab_size = model.text.vocab_size

print("Model parameters:", f"{np.sum([int(np.prod(p.shape)) for p in model.parameters()]):,}")
print("Context length:", context_length)
print("Vocab size:", vocab_size)

# generate board/action embedding based on pgn string
def generate_representation_for_final(pgn):
    game = chess.pgn.read_game(io.StringIO(pgn))
    # keep only the last example, i.e. the final position of the game
    data = list(generate_examples_from_game_no_comment(game))[-1]
    for key in data.keys():
        data[key] = np.array(data[key])
    board = get_lc0_input_planes_tf(data).numpy()
    action = data['probs']
    return board, action

# Prepare input
prompt = "Black plays Sicilian Defense"
pgn_str = '1. e4 c5'
board, action = generate_representation_for_final(pgn_str)
text_tokens = tokenizer([prompt])
image_input = torch.from_numpy(np.stack([board], axis=0))
action_input = torch.from_numpy(np.stack([action], axis=0))

# infer
with torch.no_grad():
    image_features = model.encode_image((image_input, action_input)).float()
    text_features = model.encode_text(text_tokens).float()

# L2-normalize so the dot product below is a cosine similarity
image_features /= image_features.norm(dim=-1, keepdim=True)  # n * dim
text_features /= text_features.norm(dim=-1, keepdim=True)  # m * dim
similarity = text_features.cpu().numpy() @ image_features.cpu().numpy().T  # m * n
print(similarity)
```

## Limitations

ChessCLIP, like other CLIP-based models, has limitations that need to be taken into consideration. For instance, the model may produce incorrect similarities, especially when given complex or ambiguous inputs, or language inputs that fall outside its training data.

We highly appreciate contributions from individuals and organizations to enhance the model's performance and stability. Specifically, we welcome annotated data, such as annotated PGN (Portable Game Notation), which can be used to train a more robust and reliable CLIP model.

## Benchmark

Please refer to our [paper](https://together.xyz) and [code](https://github.com/waterhorse1/ChessGPT) for benchmark results.
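
## Example: Ranking Prompts by Similarity

The inference snippet produces an `m * n` similarity matrix for `m` prompts and `n` (board, action) inputs. As a minimal sketch of how such a matrix might be used, the code below ranks several candidate prompts for one board. It is an illustration only: the embeddings are random placeholders standing in for the outputs of `model.encode_text` and `model.encode_image`, the helper names `cosine_similarity_matrix` and `rank_prompts` are hypothetical and not part of the ChessGPT codebase, and the extra prompts are made up for the example.

```python
import numpy as np


def cosine_similarity_matrix(text_features, board_features):
    """Return an (m, n) matrix of cosine similarities.

    text_features:  (m, dim) array of text embeddings
    board_features: (n, dim) array of (board, action) embeddings
    """
    # L2-normalize each row so the dot product equals the cosine similarity
    text = text_features / np.linalg.norm(text_features, axis=-1, keepdims=True)
    board = board_features / np.linalg.norm(board_features, axis=-1, keepdims=True)
    return text @ board.T


def rank_prompts(similarity, prompts, board_index=0):
    """Sort prompts from most to least similar for one board column."""
    scores = similarity[:, board_index]
    order = np.argsort(-scores)  # indices in descending score order
    return [(prompts[i], float(scores[i])) for i in order]


# Placeholder embeddings (assumption): random vectors instead of real model outputs.
rng = np.random.default_rng(0)
prompts = [
    "Black plays Sicilian Defense",
    "White plays the Queen's Gambit",
    "Black plays the French Defense",
]
text_features = rng.normal(size=(len(prompts), 8))
board_features = rng.normal(size=(1, 8))

similarity = cosine_similarity_matrix(text_features, board_features)
for prompt, score in rank_prompts(similarity, prompts):
    print(f"{score:+.3f}  {prompt}")
```

With real embeddings, replace the random arrays with `text_features.cpu().numpy()` and `image_features.cpu().numpy()` from the inference snippet. The scores are cosine similarities in the range [-1, 1], so they are only meaningful relative to each other for the same board.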