An Example of Isotonic Regression Using the scikit-learn Library

Isotonic regression is a niche machine learning technique. In the basic scenario, you have a single predictor variable (for example, the size of some object) and a single target value to predict (for example, the porosity of that object), where the target values consistently increase or consistently decrease as the predictor increases (with a few exceptions allowed).

I put together a demo using the scikit-learn IsotonicRegression class. But what I found most interesting was that, when I did my research, I found a lot of contradictory and just plain incorrect information on the Internet. Briefly, I tracked down the primary source research papers and discovered that isotonic regression is quite complicated, much more complicated than most of the Internet resources indicated.

The following graph illustrates my demo. The artificial training data has 10 x values: (0.0, 1.0, 2.0, . . ., 9.0). The training y values increase with just one exception: (3.0, 9.0, 14.0, 18.0, 21.0, 23.0, 24.0, 20.0, 25.0, 27.0). The 10 training data items are the large red dots. I generated 40 test x values and used the trained model to compute the 40 predicted y values. They are the medium-size blue dots on the graph. And to illustrate why isotonic regression might be used, I added a simple linear regression line (the small green dots).
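
To give a rough idea of what happens at that one exception (the 20.0 that follows 24.0), here is a minimal sketch of the pool-adjacent-violators idea that is commonly used for the unweighted, increasing, least-squares case. The function name pava_increasing is my own invention for illustration, and this is not necessarily how scikit-learn implements its fit.

```python
# Sketch only: pool adjacent violators for an unweighted, non-decreasing fit.
# pava_increasing is a hypothetical helper, not part of scikit-learn.

def pava_increasing(y):
    blocks = []  # each block is [sum_of_values, count]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean is larger
        # than the mean of the newest block (a monotonicity violation).
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand each block back out to one fitted value per input item.
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

y_train = [3, 9, 14, 18, 21, 23, 24, 20, 25, 27]
fitted = pava_increasing(y_train)
print([round(v, 2) for v in fitted])
```

With the demo's y values, the three items 23, 24, 20 get pooled and replaced by their mean, about 22.33, and every other value is left unchanged.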

There’s no significant moral to this post/story except maybe to point out that just because some information is on the Internet, it doesn’t mean that information is accurate. (Including this blog post!)

Two pages from a research paper that explains how isotonic regression works. Complicated stuff!

Demo program.

# isotonic_regression_scikit.py

import numpy as np
from sklearn.isotonic import IsotonicRegression

np.set_printoptions(precision=1, suppress=True)

# training data: 10 items, y increases except at x = 7
X = np.array([0,1,2,3,4,5,6,7,8,9],
  dtype=np.float64)
y = np.array([3, 9, 14, 18, 21, 23,
  24, 20, 25, 27], dtype=np.float64)

print(X); input()
print(y); input()

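# out_of_bounds='clip' means x values outside the training
# range are clipped to the nearest training x before predicting
iso_reg = IsotonicRegression(out_of_bounds='clip')
iso_reg.fit(X, y)
pred = iso_reg.predict(X)  # fitted values at the training x
print(pred); input()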

# predict for 60 x values from -1.0 to 13.75 in steps of 0.25
x = -1.0
for i in range(60):
  y_pred = iso_reg.predict([x])
  print("%0.1f  %0.1f" % (x, y_pred[0]))
  x += 0.25

print("\nEnd demo ")
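
The demo program above does not include the simple linear regression line that appears on the graph. The following is a minimal sketch of how such a line could be computed for the same training data, using the ordinary least-squares closed form in plain Python. The variable names are my own, and this is not necessarily how the line on the graph was produced.

```python
# Sketch only: ordinary least-squares line for the demo's training data.
# All names here are illustrative, not taken from the demo program.

x_train = [float(i) for i in range(10)]
y_train = [3.0, 9.0, 14.0, 18.0, 21.0, 23.0, 24.0, 20.0, 25.0, 27.0]

n = len(x_train)
x_mean = sum(x_train) / n
y_mean = sum(y_train) / n

# slope = Sxy / Sxx, intercept = y_mean - slope * x_mean
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(x_train, y_train))
sxx = sum((x - x_mean) ** 2 for x in x_train)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

print("slope = %0.4f  intercept = %0.4f" % (slope, intercept))
for x in (-1.0, 0.0, 9.0, 12.0):
    print("%0.1f  %0.1f" % (x, intercept + slope * x))
```

The line predicts about 8.1 at x = 0 and about 28.7 at x = 9, where the training values are 3.0 and 27.0, and it keeps growing outside the training range. With out_of_bounds='clip', the isotonic model's predictions outside the training range stay at the boundary fitted values instead.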