NFL 2025 Season - Zoltar Looks at Predicting Over-Under Scores Using Linear Regression

Zoltar is my NFL football prediction system. Zoltar uses a combination of classical statistics, a neural network, and a form of reinforcement learning.

My basic Zoltar predicts game results against Las Vegas point spread data. For example, if the Chiefs are playing the Rams, the Vegas line might be “Chiefs -7.5” meaning the Chiefs are favored and must win by more than 7.5 points for a bet on them to pay off. Zoltar will predict which team will win and by how many points, such as “Chiefs will win by 3 points”. Because Zoltar thinks the Chiefs will win by fewer than 7.5 points, this would result in a recommendation to bet on the underdog Rams.

I decided to modify Zoltar to predict the total number of points that will be scored in a game. This can be used to place over-under bets. For example, in the hypothetical Chiefs vs. Rams game, the Vegas over-under line might be 44.5 points. You can bet that the total number of points scored by both teams will either be over 44.5 points or under 44.5 points.
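The over-under decision itself is simple once a predicted total is available. Here is a minimal sketch; the function name over_under_pick and the predicted total of 48.0 points are hypothetical, made up for illustration, and are not taken from Zoltar.

```python
def over_under_pick(predicted_total: float, vegas_line: float) -> str:
    """Suggest which side of an over-under line to take."""
    if predicted_total > vegas_line:
        return "over"
    if predicted_total < vegas_line:
        return "under"
    return "no bet"  # prediction equals the line exactly

# Hypothetical prediction of 48.0 points against the 44.5 line
print(over_under_pick(48.0, 44.5))  # over
```

A real system would probably require the prediction to differ from the line by some minimum margin before recommending a bet.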

The first step was to generate data. After quite a few hours of work, I used Zoltar to create a data file that looks like:

# TotalPointsData2024.txt
# week, visitorID, visitorName, visitorRating, homeID, homeName, homeRating, totalPts
#
1 24 ravens 2105 8 chiefs 2058 47
1 19 packers 2011 12 eagles 2058 63
1 28 steelers 2035 13 falcons 1964 28
1 6 cardinals 1894 2 bills 2058 62
. . .
18 11 dolphins 2025 17 jets 1874 52
18 31 vikings 2140 18 lions 2200 40

Each line represents one of the 272 games played during the 18 weeks of the 2024 regular season. The eight values on each line are week number, visitor team ID, visitor team name, visitor team rating according to Zoltar, home team ID, home team name, home team rating according to Zoltar, total points scored in the game. The predictor values are visitor team rating and home team rating. The other fields are for debugging.
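Reading a file in this format takes only a few lines. The sketch below is one possible way to do it, not the actual Zoltar code; the function name parse_games is my own, and the sample text is just the first three games from the listing.

```python
from typing import List, Tuple

SAMPLE = """\
# TotalPointsData2024.txt
#
1 24 ravens 2105 8 chiefs 2058 47
1 19 packers 2011 12 eagles 2058 63
1 28 steelers 2035 13 falcons 1964 28
"""

def parse_games(text: str) -> Tuple[List[List[float]], List[float]]:
    """Return (X, y): each X row is [visitorRating, homeRating], y is totalPts."""
    X, y = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        tokens = line.split()
        if len(tokens) != 8:
            continue  # skip anything that is not a full game record
        X.append([float(tokens[3]), float(tokens[6])])  # the two predictors
        y.append(float(tokens[7]))                      # the target
    return X, y

X, y = parse_games(SAMPLE)
print(X[0], y[0])  # [2105.0, 2058.0] 47.0
```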

At this point, I was ready to feed the data to various machine learning regression models to try to predict the total number of points scored. For my first attempt, I used basic linear regression to establish baseline results. I realized that I needed to normalize the team rating predictor values by dividing by 10,000, and normalize the target total number of points scored by dividing by 100. The output of the preliminary exploration was:

Begin NFL 2024 Total Points prediction using
 baseline linear regression

Loading data into memory

First three train X:
   0.2105   0.2058
   0.2011   0.2058
   0.2035   0.1964

First three train y:
0.47
0.63
0.28

Setting lrnRate = 0.0001
Setting maxEpochs = 200

Creating and training Linear Regression model
epoch =     0  RMSE =   0.4568
epoch =    40  RMSE =   0.1881
epoch =    80  RMSE =   0.1373
epoch =   120  RMSE =   0.1314
epoch =   160  RMSE =   0.1308
Done

Coefficients/weights:
0.0881  0.0902
Bias/constant: 0.4214

Evaluating model
Accuracy train (within 0.15) = 0.4301
Train RMSE = 0.1308

Predicting for trainX[0]:
Predicted y = 0.4585

End program
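A run like the one above can be sketched in plain Python. This is not the actual demo program. It assumes training by stochastic gradient descent with zero-initialized weights (the output doesn't say how the model was initialized or trained), and it uses only the six games shown in the data listing, so it won't reproduce the numbers above. The learning rate and epoch count are the ones in the output.

```python
import math
import random

# The six games shown in the listing: (visitorRating, homeRating, totalPts)
games = [
    (2105, 2058, 47), (2011, 2058, 63), (2035, 1964, 28),
    (1894, 2058, 62), (2025, 1874, 52), (2140, 2200, 40),
]

# Normalize: ratings divided by 10,000 and total points divided by 100
X = [[v / 10000.0, h / 10000.0] for v, h, _ in games]
y = [t / 100.0 for _, _, t in games]

def predict(x, weights, bias):
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def rmse(X, y, weights, bias):
    sq = sum((predict(x, weights, bias) - t) ** 2 for x, t in zip(X, y))
    return math.sqrt(sq / len(y))

def train(X, y, lrn_rate=0.0001, max_epochs=200, seed=0):
    rnd = random.Random(seed)
    weights = [0.0] * len(X[0])  # assumption: zero initialization
    bias = 0.0
    order = list(range(len(X)))
    for _ in range(max_epochs):
        rnd.shuffle(order)  # visit the training items in random order
        for i in order:
            err = predict(X[i], weights, bias) - y[i]
            # gradient of half the squared error for each weight and the bias
            for j in range(len(weights)):
                weights[j] -= lrn_rate * err * X[i][j]
            bias -= lrn_rate * err
    return weights, bias

rmse_before = rmse(X, y, [0.0, 0.0], 0.0)
weights, bias = train(X, y)
rmse_after = rmse(X, y, weights, bias)
print(f"RMSE before = {rmse_before:.4f}, after = {rmse_after:.4f}")
```

With only six training items and such a small learning rate, the error falls slowly. The full 272-game file gives the model many more updates per epoch.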

As expected, the prediction accuracy wasn’t very good. A prediction is scored as correct if it’s within 15% of the true target number of points scored. For example, if the true target for a game was 50 points, any prediction between 42.5 points and 57.5 points would be scored as correct.
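The accuracy metric can be sketched like this. The function name accuracy and the pct_close parameter are my own, and whether a prediction exactly on the boundary counts as correct is an assumption (here it doesn't).

```python
def accuracy(preds, targets, pct_close=0.15):
    """Fraction of predictions within pct_close of the true target."""
    n_correct = sum(
        1 for p, t in zip(preds, targets)
        if abs(p - t) < pct_close * abs(t)
    )
    return n_correct / len(targets)

# Four made-up predictions for a 50-point game, in normalized units.
# 15% of 50 points is 7.5 points, so 43 and 57 are close enough,
# but 42 and 58 are not.
print(accuracy([0.43, 0.57, 0.42, 0.58], [0.50, 0.50, 0.50, 0.50]))  # 0.5
```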

I dropped the data into an Excel spreadsheet and did some analysis. The average total points scored by both teams in a game during the 272 regular season games of the 2024 season was 45.82 points. If you naively predict 45.82 points for each game, the root mean squared error of that naive model is 0.1309 in normalized units (about 13.09 points). The linear regression model RMSE of 0.1308 is almost exactly the same. So I took a closer look at the predictions from the linear regression model, and they are all between 44.0 and 46.0 points, essentially just predicting the average number of points scored. Darn.
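The naive baseline is easy to check in code instead of a spreadsheet. The RMSE of always predicting the mean is the same thing as the population standard deviation of the targets. The sketch below uses only the six games shown in the data listing, so its numbers will differ from the full-season values.

```python
import math
import statistics

# Total points for the six games shown in the listing, normalized by 100
totals = [47, 63, 28, 62, 52, 40]
y = [t / 100.0 for t in totals]

# Naive model: predict the mean for every game
mean_y = sum(y) / len(y)
naive_rmse = math.sqrt(sum((mean_y - t) ** 2 for t in y) / len(y))

# Same value as the population standard deviation of the targets
print(f"mean = {mean_y:.4f}  naive RMSE = {naive_rmse:.4f}")
print(f"pstdev = {statistics.pstdev(y):.4f}")
```

Any regression model has to beat this number to be useful.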

My next steps will be to try more powerful prediction algorithms, specifically kernel ridge regression and neural network regression.

Sports betting is a multibillion-dollar industry that is growing very quickly.

Left: Las Vegas clearly sees sports betting as the future of gaming. This is the MGM Grand sports book. Bettors can place wagers at the walk-up counter or at computer kiosks.

Center: This is an old “chuck-a-luck” game. The birdcage had three dice. The odds are very poor and the game is no longer played as far as I know.

Right: This is an old “wheel of fortune”. The odds aren’t very good, but the game is still in Vegas, mostly because of the visual appeal.