I recently upgraded my Keras neural network library to version 2.6.0 and decided to revisit my three basic examples — Iris (multi-class classification), Banknote (binary classification), and Boston (regression). This morning I refactored my Boston example.
Even though it had only been a few months since I last did the Boston example, I was surprised at how much Keras had changed, and how much my preferred techniques had changed.
The goal of the Boston Housing Dataset example is to predict the median house price in one of 506 towns near Boston. There are 13 predictor variables — average number of rooms in a house in the town, tax rate in the town, percentage of Black people in town, and so on.
I used order-magnitude normalization on the numeric predictors, then randomly split the 506-item dataset into a training set (400 items) and a test set (106 items). Preparing even simple data like this is tedious and very time-consuming. See https://jamesmccaffreyblog.com/2021/08/18/preparing-the-boston-housing-dataset-for-pytorch/.
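The linked post has the full details. As a rough sketch of that kind of prep, order-magnitude normalization just divides each numeric column by a power of 10 so all values land in roughly the same range. The file name and per-column divisors below are illustrative, not necessarily the ones actually used:

import numpy as np

rnd = np.random.RandomState(1)
raw = np.loadtxt("boston_raw.txt", delimiter="\t",
  dtype=np.float32)               # 506 rows, 14 columns
raw[:,9] /= 1000.0                # e.g., tax rate is order 100s
raw[:,11] /= 1000.0               # e.g., B index is order 100s
indices = rnd.permutation(506)    # shuffle row order
train = raw[indices[0:400],:]     # first 400 shuffled rows
test = raw[indices[400:506],:]    # remaining 106 rows
np.savetxt("boston_train.txt", train, fmt="%0.6f", delimiter="\t")
np.savetxt("boston_test.txt", test, fmt="%0.6f", delimiter="\t")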
For my new version of the Boston example, one of the main changes I made was to write a much faster function to compute model accuracy. When you do classification with Keras, as in the Iris example, you get a built-in accuracy metric. But with regression, you must write your own accuracy function. For classification, a prediction of a discrete value like “red” is either correct or incorrect. But for regression, a prediction of a house price like 0.49500 = $49,500 will never be exactly correct, so you must determine if the prediction is within a specified percentage of the target value. For example, with a 15 percent closeness criterion, a prediction for a target of 0.50000 counts as correct if it falls between 0.42500 and 0.57500.
When I use the PyTorch library, my standard approach for regression accuracy is:
loop
  get a line of input data
  get target value
  feed input to model, get predicted value
  if predicted is close enough to target
    num_correct += 1
  else
    num_wrong += 1
  end-if
end-loop
return num_correct / (num_correct + num_wrong)
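In PyTorch code that loop looks something like the following. This is a minimal sketch, assuming data_x and data_y are already tensors and the model maps a batch of inputs to a [batch, 1] output tensor. (The Keras equivalent appears as accuracy_slow() in the full listing below.)

import torch as T

def accuracy_slow(model, data_x, data_y, pct_close):
  # item-by-item regression accuracy -- slow but easy to debug
  n_correct = 0; n_wrong = 0
  for i in range(len(data_x)):
    X = data_x[i].reshape(1, -1)  # make a batch of one item
    y = data_y[i]                 # target value
    with T.no_grad():
      oupt = model(X)             # predicted value, shape [1,1]
    if T.abs(oupt[0][0] - y) < T.abs(pct_close * y):
      n_correct += 1
    else:
      n_wrong += 1
  return (n_correct * 1.0) / (n_correct + n_wrong)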
This approach examines one data item at a time. The technique is simple, and allows you to examine each item to see why a prediction was correct or wrong. But when using Keras, the item-by-item approach is painfully slow. I’m not sure why Keras is roughly 10 times slower than PyTorch when computing accuracy item-by-item.
So, a quick approach is useful. The idea is to compute all outputs at once.
read all input data
read all target values
feed all input, get all predicted values
compare all predicted to all targets
  (result is like [1, 0, 0, 1, . .])
sum the comparison results
return sum / num_items
The ideas are simple and the code is short, but implementation is very, very tricky. Here is my implementation for the Boston data:
def accuracy_quick(model, data_x, data_y, pct):
  n = len(data_x)
  oupt = model(data_x)
  oupt = tf.reshape(oupt, [-1])       # 1D
  max_deltas = tf.abs(pct * data_y)   # max allowable deltas
  abs_deltas = tf.abs(oupt - data_y)  # actual differences
  results = abs_deltas < max_deltas   # [True, False, . .]
  n_correct = np.sum(results)
  acc = n_correct / n
  return acc
Even though a quick accuracy function implementation is tricky, once you know how to compute quick accuracy for one specific regression problem, it’s relatively easy to adapt the code to any other regression problem.
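To see what the vectorized comparison is doing, here is a self-contained sketch with made-up predicted and target values (no model needed):

import numpy as np
import tensorflow as tf

targets = tf.constant([0.30, 0.50, 0.80], dtype=tf.float32)
preds = tf.constant([0.33, 0.60, 0.79], dtype=tf.float32)
pct = 0.15
max_deltas = tf.abs(pct * targets)    # [0.045, 0.075, 0.120]
abs_deltas = tf.abs(preds - targets)  # [0.030, 0.100, 0.010]
results = abs_deltas < max_deltas     # [True, False, True]
acc = np.sum(results) / len(targets)  # 2 correct / 3 = 0.6667
print(acc)

The less-than operator applied to two TensorFlow tensors produces a Boolean tensor, and np.sum() treats True as 1 and False as 0, so the sum is the number of correct predictions.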

The digital artist who goes by the name “batjorge” uses fractal generation software (Mandelbulb) to create interesting images of alien mushrooms and fungi. Are the images accurate? For art, the concept of accuracy doesn’t usually apply.
Code below. Very long.
# boston_tfk.py
# regression on the Boston Housing dataset
# Keras 2.6.0 in TensorFlow 2.6.0 ("_tfk")
# Anaconda3-2020.02 Python 3.7.6 Windows 10
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2' # suppress CPU warn
import numpy as np
import tensorflow as tf
from tensorflow import keras as K
class MyLogger(K.callbacks.Callback):
  def __init__(self, n, data_x, data_y, pct_close):
    super().__init__()  # initialize base Callback
    self.n = n  # print loss every n epochs
    self.data_x = data_x
    self.data_y = data_y
    self.pct_close = pct_close

  def on_epoch_end(self, epoch, logs={}):
    if epoch % self.n == 0:
      curr_loss = logs.get('loss')  # loss on curr batch
      print("epoch = %4d  curr batch loss (MSE) = %0.6f " \
        % (epoch, curr_loss))
def accuracy_slow(model, data_x, data_y, pct_close):
  # item-by-item -- slow -- for debugging
  n_correct = 0; n_wrong = 0
  n = len(data_x)
  for i in range(n):
    x = np.array([data_x[i]])  # [[ x ]]
    predicted = model.predict(x)
    actual = data_y[i]
    if np.abs(predicted[0][0] - actual) < \
      np.abs(pct_close * actual):
      n_correct += 1
    else:
      n_wrong += 1
  return (n_correct * 1.0) / (n_correct + n_wrong)
def accuracy_quick(model, data_x, data_y, pct):
  n = len(data_x)
  oupt = model(data_x)
  oupt = tf.reshape(oupt, [-1])       # 1D
  max_deltas = tf.abs(pct * data_y)   # max allowable deltas
  abs_deltas = tf.abs(oupt - data_y)  # actual differences
  results = abs_deltas < max_deltas   # [True, False, . .]
  n_correct = np.sum(results)
  acc = n_correct / n
  return acc
def main():
  print("\nBoston Housing regression example ")
  np.random.seed(1)
  tf.random.set_seed(1)
  kv = K.__version__
  print("Using Keras: ", kv)

  print("\nLoading Boston data into memory ")
  train_file = ".\\Data\\boston_train.txt"  # 400 lines
  train_x = np.loadtxt(train_file,
    usecols=[0,1,2,3,4,5,6,7,8,9,10,11,12],
    delimiter="\t", skiprows=0, dtype=np.float32)
  train_y = np.loadtxt(train_file, usecols=[13],
    delimiter="\t", skiprows=0, dtype=np.float32)

  test_file = ".\\Data\\boston_test.txt"  # 106 lines
  test_x = np.loadtxt(test_file,
    usecols=[0,1,2,3,4,5,6,7,8,9,10,11,12],
    delimiter="\t", skiprows=0, dtype=np.float32)
  test_y = np.loadtxt(test_file, usecols=[13],
    delimiter="\t", skiprows=0, dtype=np.float32)

  print("\nCreating 13-10-10-1 neural network ")
  model = K.models.Sequential()
  model.add(K.layers.Dense(units=10, input_dim=13,
    activation='tanh'))  # hid1
  model.add(K.layers.Dense(units=10,
    activation='tanh'))  # hid2
  model.add(K.layers.Dense(units=1,
    activation=None))    # output layer
  opt = K.optimizers.Adam(learning_rate=0.01)
  model.compile(loss='mean_squared_error',
    optimizer=opt, metrics=['mse'])

  max_epochs = 1000
  log_every = 100
  my_logger = MyLogger(log_every, train_x,
    train_y, 0.15)

  print("\nStarting training ")
  h = model.fit(train_x, train_y, batch_size=4,
    epochs=max_epochs, verbose=0, callbacks=[my_logger])
  print("Training finished ")

  eval = model.evaluate(train_x, train_y, verbose=0)
  # [0] = loss (mse), [1] = compile-metrics = 'mse' again
  print("\nFinal overall loss (MSE) on train = %0.6f" % eval[0])
  eval = model.evaluate(test_x, test_y, verbose=0)
  print("Final overall loss (MSE) on test = %0.6f" % eval[0])

  # loss_list = h.history['loss']  # loss LAST BATCH each epoch
  # print(loss_list[len(loss_list)-1])

  # train_acc = accuracy_slow(model, train_x, train_y, 0.15)
  # print("\nAccuracy on train data = %0.4f" % train_acc)
  # test_acc = accuracy_slow(model, test_x, test_y, 0.15)
  # print("Accuracy on test data = %0.4f" % test_acc)

  train_acc = accuracy_quick(model, train_x, train_y, 0.15)
  print("\nAccuracy on train data = %0.4f" % train_acc)
  test_acc = accuracy_quick(model, test_x, test_y, 0.15)
  print("Accuracy on test data = %0.4f" % test_acc)

  print("\nSaving trained model as boston_model.h5 ")
  # model.save_weights(".\\Models\\boston_model_wts.h5")
  # model.save(".\\Models\\boston_model.h5")

  np.set_printoptions(formatter={'float': '{: 0.6f}'.format})
  x = np.array([[0.000493, 0.330000, 0.021800, -1, 0.472000,
    0.684900, 0.703000, 0.031827, 0.070000, 0.222000,
    0.184000, 0.396900, 0.075300]],
    dtype=np.float32)
  predicted = model.predict(x)
  print("\nUsing model to predict median price for features: ")
  print(x)
  print("\nPredicted price is: ")  # expected = 0.282000
  print(predicted)

  print("\nEnd demo ")

if __name__ == "__main__":
  main()
