Version 0.2a of ChessPT is currently training.
I decided to hold off on the actual v1.0 until I have a better understanding of where I want to go and have successfully trained the first fine-tune.
I'm playing around with a loss that is heavily influenced by the idea of reinforcement learning.
Basically, I'm punishing the model for generating invalid PGN strings.
The current approach keeps things simple (see the sketch after this list):
-2: wrong characters in output
-1: invalid PGN string, but valid charset
0: valid PGN string, incl. valid moves
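For concreteness, here is one way such a scoring function could look. This is a minimal sketch, not the implementation from the post: it assumes python-chess is used for validation, and the charset, function name, and example strings are my own assumptions.

```python
# Minimal sketch of the -2/-1/0 scoring rule above (assumptions noted inline).
import io
import string

import chess.pgn

# Hypothetical character whitelist for PGN output (my assumption, not from the post)
PGN_CHARSET = set(string.ascii_letters + string.digits + ' .+#=x-/*!?()[]"{}\n')


def pgn_score(text: str) -> int:
    """Return -2 for wrong characters, -1 for invalid PGN, 0 for valid PGN."""
    # -2: output contains characters outside the allowed PGN charset
    if any(ch not in PGN_CHARSET for ch in text):
        return -2

    # Parse with python-chess; parse problems and illegal moves are collected
    # in game.errors instead of raising.
    game = chess.pgn.read_game(io.StringIO(text))
    if game is None or game.errors:
        return -1  # valid charset, but not a valid (replayable) PGN string

    return 0  # valid PGN string, incl. valid moves


print(pgn_score("1. e4 e5 2. Nf3 Nc6"))  # 0  (valid moves)
print(pgn_score("1. e4 e9"))             # -1 (illegal move, valid charset)
print(pgn_score("1. e4 ♞c6"))            # -2 (character outside the charset)
```

Note that a score like this is computed per generated sequence and isn't differentiable, which is presumably why the post frames it as a reinforcement-style signal rather than a plain token-level loss.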
GPT-4o helped me with the implementation, so I'm expecting some errors in it.
Training should finish in roughly 14h; I'll upload the new weights then.
But I still need to run extensive tests on this loss before I can happily call it v0.2 ✌️
BTW, I'm also building a Space for the model, which will be published tonight after I add descriptions and a nice interface. ♟️
philipp-zettl/chessPT
philipp-zettl/ChessPT