# Model Card for chessGPT2

## Model Details

### Model Description

This is a chess-playing GPT-2 model. It was finetuned from the austindavis/chessGPT_d12 model, but uses a fixed 3-tokens-per-ply tokenization scheme rather than the variable-length tokenization of chessGPT_d12, where promotion tokens interrupt the otherwise consistent 2-tokens-per-ply structure. The model was finetuned on the February 2023 Lichess UCI dataset. Training progress and configuration are logged to the Weights & Biases run at https://wandb.ai/austinleedavis/chess_public/runs/itgnfae4. Although 27 epochs were completed, the checkpoint published here is from epoch 20 (step 399,825), because validation loss spiked during epoch 25.
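As an illustration of the fixed-length scheme, a plain move and a promotion move should occupy the same number of tokens per ply. This is a sketch only: the exact token boundaries are an assumption based on the description above, and it requires the custom Tokenizers fork described under Uses below.

```python
from transformers import PreTrainedTokenizerFast

tokenizer = PreTrainedTokenizerFast.from_pretrained("austindavis/chessGPT2")

# Under the fixed 3-tokens-per-ply scheme, "e7e8q" should not take more
# tokens than "e2e4" (unlike the variable-length chessGPT_d12 scheme).
for ply in ["e2e4", "e7e8q"]:
    ids = tokenizer(ply)["input_ids"]
    print(ply, tokenizer.convert_ids_to_tokens(ids))
```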

## Uses

This model requires a custom version of the Tokenizers library. The customization adds an `Append` normalizer, which appends a space to the end of every input sequence. To install the custom tokenizer, run:

```bash
pip install git+https://github.com/austindavis/tokenizers.git#subdirectory=bindings/python
```

Without this customization, you can still run the model, but you must remove the following `normalizer` entry from `tokenizer.json` (lines 43 through 46) and manually append a space to the end of every input sequence (see the sketch after the snippet):

"normalizer": {
    "type": "Append",
    "append": " "
  },
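For reference, here is a minimal sketch of that manual workaround, assuming the stock Tokenizers package and the published tokenizer files with the `normalizer` entry removed as described above:

```python
from transformers import PreTrainedTokenizerFast

tokenizer = PreTrainedTokenizerFast.from_pretrained("austindavis/chessGPT2")

# Without the custom Append normalizer, the trailing space
# must be added to every input sequence by hand.
moves = "e2e4 e7e5"
ids = tokenizer(moves + " ")["input_ids"]
```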

Here's a lambda function that facilitates decoding into valid UCI by removing the extra spaces added by the tokenizer:

```python
from transformers import PreTrainedTokenizerFast

tokenizer = PreTrainedTokenizerFast.from_pretrained("austindavis/chessGPT2")
decode = lambda ids: tokenizer.decode(ids).replace("  ", "_").replace(" ", "").replace("_", " ")
```

Then, you can encode/decode as follows:

```python
>>> decode(tokenizer("e2e4")['input_ids'])
'<|startoftext|>e2e4 '
```
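For completeness, here is a minimal generation sketch. The model class (`GPT2LMHeadModel`) and the greedy decoding settings are assumptions, not part of this card; under the 3-tokens-per-ply scheme, `max_new_tokens=3` corresponds to generating one ply:

```python
from transformers import GPT2LMHeadModel, PreTrainedTokenizerFast

tokenizer = PreTrainedTokenizerFast.from_pretrained("austindavis/chessGPT2")
model = GPT2LMHeadModel.from_pretrained("austindavis/chessGPT2")  # assumed model class

decode = lambda ids: tokenizer.decode(ids).replace("  ", "_").replace(" ", "").replace("_", " ")

# Encode an opening move and greedily generate one reply (3 tokens = 1 ply).
inputs = tokenizer("e2e4", return_tensors="pt")
out = model.generate(inputs["input_ids"], max_new_tokens=3, do_sample=False)
print(decode(out[0].tolist()))
```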