# NBA Predictions

This repo contains the code and weights for an AI model that predicts the outcome of NBA games. Its output is the probability of each possible point spread.

The model takes 8 players on each team (home and away), plus their ages, as input. It then outputs a probability for each point spread between -20 and +20 points, from the home team's point of view.

For example, the chart below shows the model giving the home team a 77% chance to win and a 14% chance of winning by 20 or more points. A chart like this is typical of a dominant team playing at home. Most games produce a shape closer to a bell curve.

![NBA prediction graph](prediction.png)
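To make the headline numbers concrete, here is a minimal sketch of how a win probability and a blowout probability can be read off a spread distribution. It assumes the model output has already been converted to a plain dictionary mapping the home team's margin to a probability, and that the -20 and +20 entries absorb everything decided by 20 or more points. The function name `summarize_spread_probs` and the toy numbers are hypothetical and are not part of this repo.

```python
def summarize_spread_probs(spread_probs, blowout_margin=20):
    """Summarize a {home_margin: probability} mapping.

    Positive margins mean the home team wins by that many points.
    Assumption: the +/-20 entries also cover games decided by more than 20.
    """
    total = sum(spread_probs.values())
    # Normalize defensively in case the probabilities do not sum to exactly 1.
    probs = {margin: p / total for margin, p in spread_probs.items()}

    return {
        "home_win": sum(p for m, p in probs.items() if m > 0),
        "home_blowout": sum(p for m, p in probs.items() if m >= blowout_margin),
        "away_blowout": sum(p for m, p in probs.items() if m <= -blowout_margin),
    }


# Made-up toy distribution for illustration only, not real model output.
toy_probs = {-20: 0.05, -5: 0.15, 5: 0.40, 12: 0.25, 20: 0.15}
summary = summarize_spread_probs(toy_probs)
print(summary)
```

The away team's win probability is simply the mass on the negative margins, which is how a 77% home win chance corresponds to a 23% chance for the visitors.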
## Installation

I recommend installing Python 3.11.8, since that is the version the repo was written and tested with. The code will likely work with most recent versions of Python, though.

Once you have Python installed, run `pip install -r requirements.txt`. Installing the dependencies will take a while if you don't already have PyTorch cached.
## Usage

The `example.ipynb` notebook shows how to use the model to predict the final game of the 2023-24 NBA season, a game between the Dallas Mavericks and Boston Celtics. It outputs the chart above.

To change the players and their ages, look up the corresponding tokens in the `player_tokens.csv` and `age_tokens.csv` files.

For example, suppose you wanted to remove Kristaps Porzingis from Boston's team and swap who was home and away. You would take the token representing Porzingis, `4416`, out of the `home_team_tokens` list and replace it with, say, the token for Payton Pritchard, `4999`. You would then look up Pritchard's age (26), find the corresponding age token in `age_tokens.csv`, which is `11`, and use it to replace Porzingis' age token (the second-to-last token).

To swap home and away, you could swap the contents of the variables containing the player and age tokens, or just set the `swap_home_away` variable to `True`. The results are as follows:

![NBA Finals prediction without Porzingis](porzingis-swapped-for-pritchard.png)

As you can see, Dallas' win probability improved from 23% to 35%, and their chance of being blown out by 20+ points decreased from 14% to 10%. Clearly, the model thinks Porzingis is important to the Celtics' chances, but it still considers Boston the superior team without him.
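The token edit described in this section can be sketched roughly as follows. This is only an illustration: the helper `replace_player` is hypothetical, the list layout (eight player tokens followed by eight age tokens) is an assumption, and every value other than `4416`, `4999`, and `11` is a placeholder. Check `example.ipynb` for the real contents of `home_team_tokens`.

```python
def replace_player(team_tokens, old_player_token, new_player_token,
                   age_index, new_age_token):
    """Return a copy of team_tokens with one player and his age token replaced."""
    updated = list(team_tokens)  # copy so the original list is left untouched
    player_index = updated.index(old_player_token)  # first occurrence of the token
    updated[player_index] = new_player_token
    updated[age_index] = new_age_token
    return updated


# Placeholder tokens: only 4416 (Porzingis), 4999 (Pritchard) and 11 (age 26)
# come from the README. The rest are invented for this sketch.
home_team_tokens = [101, 102, 103, 104, 105, 106, 4416, 108,
                    9, 9, 9, 9, 9, 9, 12, 9]

new_home_team_tokens = replace_player(
    home_team_tokens,
    old_player_token=4416,
    new_player_token=4999,
    age_index=-2,       # Porzingis' age token is the second-to-last token
    new_age_token=11,
)
print(new_home_team_tokens)
```

Returning a copy keeps the original lineup around, which makes it easy to compare predictions with and without the change.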
## Training Process

I downloaded data from stats.nba.com using the [swar/nba_api](https://github.com/swar/nba_api) package to get minutes played, game outcomes, and a few other dimension fields that tie everything together. Then I ran a custom PyTorch training loop to train the model(s) on their chosen loss objective (spread, money line, or spread probability).
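As a rough illustration of what a spread probability objective can look like, the sketch below treats each margin from -20 to +20 as a class, clips larger margins into the end buckets, and scores a prediction with cross-entropy. This is an assumption made for illustration, not a description of the actual training code. The names `spread_to_class` and `cross_entropy` are hypothetical, and it uses plain Python rather than PyTorch to stay self-contained.

```python
import math

MIN_SPREAD, MAX_SPREAD = -20, 20  # range of spreads described in this README


def spread_to_class(home_margin):
    """Map a final home margin to a class index, clipping blowouts to the ends."""
    clipped = max(MIN_SPREAD, min(MAX_SPREAD, home_margin))
    return clipped - MIN_SPREAD


def cross_entropy(predicted_probs, home_margin):
    """Negative log-probability assigned to the bucket that actually occurred."""
    return -math.log(predicted_probs[spread_to_class(home_margin)])


num_classes = MAX_SPREAD - MIN_SPREAD + 1
uniform_probs = [1.0 / num_classes] * num_classes

# A 27-point home win lands in the top bucket (index 40) under this scheme.
loss = cross_entropy(uniform_probs, home_margin=27)
print(spread_to_class(27), loss)
```

A model that spreads its probability evenly scores `log(num_classes)` on every game, so any useful model should land below that baseline.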