In a standard classification problem, the goal is to predict a class label. For example, in the Iris Dataset problem, the goal is to predict a species of flower: 0 = “setosa”, 1 = “versicolor”, 2 = “virginica”. Here the class labels are just labels, without any meaning attached to their order. In an ordinal classification problem (also called ordinal regression), the class labels have an inherent order. For example, you might want to predict the median price of a house in one of 506 towns, where price can be 0 = very low, 1 = low, 2 = medium, 3 = high, 4 = very high. For an ordinal classification problem you could just use standard classification, but that approach doesn’t take advantage of the ordering information in the training data.
I coded up a demo of a simple technique using the PyTorch code library. The same technique can be used with Keras/TensorFlow too.
I used a modified version of the Boston Housing dataset. There are 506 data items. Each item is a town near Boston. There are 13 predictor variables — crime rate in town, tax rate in town, proportion of Black residents in town, and so on. The original Boston dataset contains the median price of a house in each town, divided by $1,000 — like 35.00 for $35,000 (the data is from the 1970s when house prices were low). To convert the data to an ordinal classification problem, I mapped the house prices like so:
price range            class  count
[$0 to $10,000)          0       24
[$10,000 to $20,000)     1      191
[$20,000 to $30,000)     2      207
[$30,000 to $40,000)     3       53
[$40,000 to $50,000]     4       31
                               ----
                                506
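The price-to-class binning follows directly from the table. A quick sketch of the conversion (the price values here are hypothetical, and the binning code is mine, not from the demo program):

```python
import numpy as np

# Prices are in $1,000s, as in the raw Boston data. Integer-divide
# by 10 to get the $10,000-wide bin, and clamp to class 4 so that
# an exact $50,000 price falls into the last (inclusive) bin.
prices = np.array([9.5, 15.0, 35.0, 50.0])   # hypothetical values
classes = np.minimum((prices // 10).astype(int), 4)
print(classes)  # [0 1 3 4]
```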
I normalized the numeric predictor values by dividing by a constant so that each normalized value is between -1.0 and +1.0. I encoded the single Boolean predictor value (does town border the Charles River) as -1 (no), +1 (yes).
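The divide-by-constant scheme can be sketched like this. The raw values and divisor constants below are hypothetical, chosen only to show the idea; each column's actual divisor is a constant large enough that every normalized value lands in [-1.0, +1.0]:

```python
import numpy as np

# Two hypothetical predictor columns (e.g., crime rate, tax rate).
raw = np.array([[88.9762, 396.90],
                [ 0.0063,  39.69]], dtype=np.float32)

# One divisor per column, chosen so |normalized value| <= 1.0.
divisors = np.array([100.0, 1000.0], dtype=np.float32)

normed = raw / divisors  # broadcast column-wise divide
```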
The technique I used for ordinal classification is something I devised myself, at least as far as I know. I’ve never seen it described anywhere else, but it’s not very complicated, so it may well already exist under some obscure name.
For the modified Boston Housing dataset there are k = 5 classes. The class target values in the training data are (0, 1, 2, 3, 4). My neural network system outputs a single numeric value between 0.0 and 1.0 — for example 0.2345. The class target values of (0, 1, 2, 3, 4) generate associated floating point sub-targets of (0.1, 0.3, 0.5, 0.7, 0.9). When I read the data into memory as a PyTorch Dataset object, I map each ordinal class label to the associated floating point target. Then I use standard MSELoss() to train the network.
Suppose a data item has class label = 3 (high price). The target value for that item is stored as 0.7. The computed predicted price will be something like 0.66 (close to target, so low MSE error and a correct prediction) or maybe 0.23 (far from target, so high MSE error and a wrong prediction). With this scheme, the ordering information is used.
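In code form, the two mappings look something like this. The helper names here are mine, not from the demo program, and the closed form (2c+1)/(2k) is just the start-plus-delta computation written out:

```python
# Map an ordinal class label c in 0..k-1 to its float sub-target:
# the midpoint of the c-th of k equal intervals on [0, 1].
def class_to_float_target(c, k):
  return (2 * c + 1) / (2.0 * k)  # = 1/(2k) + c * (1/k)

# Map a network output in [0.0, 1.0] back to a class label:
# the interval [i/k, (i+1)/k) decodes to class i.
def float_output_to_class(oupt, k):
  return min(int(oupt * k), k - 1)  # min() handles oupt == 1.0

k = 5
print([class_to_float_target(c, k) for c in range(k)])
# [0.1, 0.3, 0.5, 0.7, 0.9]
print(float_output_to_class(0.66, k))  # 3 -> "high"
print(float_output_to_class(0.23, k))  # 1 -> "low"
```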
For implementation, most of the work is done inside the Dataset object:
class BostonDataset(T.utils.data.Dataset):
  # features are in cols [0,12], median price as int in [13]

  def __init__(self, src_file, k):
    # k is for class_to_target_program()
    tmp_x = np.loadtxt(src_file, usecols=range(0,13),
      delimiter="\t", comments="#", dtype=np.float32)
    tmp_y = np.loadtxt(src_file, usecols=13,
      delimiter="\t", comments="#", dtype=np.int64)

    n = len(tmp_y)
    float_targets = np.zeros(n, dtype=np.float32)  # 1D
    for i in range(n):  # hard-coded is easy to understand
      if tmp_y[i] == 0:   float_targets[i] = 0.1
      elif tmp_y[i] == 1: float_targets[i] = 0.3
      elif tmp_y[i] == 2: float_targets[i] = 0.5
      elif tmp_y[i] == 3: float_targets[i] = 0.7
      elif tmp_y[i] == 4: float_targets[i] = 0.9
      else: print("Fatal logic error ")
    float_targets = np.reshape(float_targets, (-1,1))  # 2D

    self.x_data = \
      T.tensor(tmp_x, dtype=T.float32).to(device)
    self.y_data = \
      T.tensor(float_targets, dtype=T.float32).to(device)

  def __len__(self):
    return len(self.x_data)

  def __getitem__(self, idx):
    preds = self.x_data[idx]  # all cols
    price = self.y_data[idx]  # float target
    return (preds, price)  # tuple of (predictors, target)
There are a few minor but very tricky implementation details. They’d take much too long to explain in a blog post, so I’ll just say that if you’re interested, examine the code very carefully.

I don’t think it’s possible to assign a strictly numeric value to art. Here are two clever illustrations by artist Casimir Lee. I like the bright colors and combination of 1920s art deco style with 1960s psychedelic style.
Code below. Long.
# boston_ordinal_simplified.py
# ordinal regression on the Boston Housing dataset
# data class labels are 0,1,2,3,4 - converted on the fly to
# targets 0.1, 0.3, 0.5, 0.7, 0.9 -- start = 1/(2k), delta = 1/k
# PyTorch 1.9.0-CPU Anaconda3-2020.02 Python 3.7.6
# Windows 10
import numpy as np
import torch as T
device = T.device("cpu")
# -----------------------------------------------------------
# crime zoning indus river nox rooms oldness dist access
# 0 1 2 3 4 5 6 7 8
# tax pup_tch black low_stat med_val
# 9 10 11 12 13
class BostonDataset(T.utils.data.Dataset):
  # features are in cols [0,12], median price as int in [13]

  def __init__(self, src_file, k):
    # k is for class_to_target_program()
    tmp_x = np.loadtxt(src_file, usecols=range(0,13),
      delimiter="\t", comments="#", dtype=np.float32)
    tmp_y = np.loadtxt(src_file, usecols=13,
      delimiter="\t", comments="#", dtype=np.int64)

    n = len(tmp_y)
    float_targets = np.zeros(n, dtype=np.float32)  # 1D
    for i in range(n):  # hard-coded is easy to understand
      if tmp_y[i] == 0:   float_targets[i] = 0.1
      elif tmp_y[i] == 1: float_targets[i] = 0.3
      elif tmp_y[i] == 2: float_targets[i] = 0.5
      elif tmp_y[i] == 3: float_targets[i] = 0.7
      elif tmp_y[i] == 4: float_targets[i] = 0.9
      else: print("Fatal logic error ")
    float_targets = np.reshape(float_targets, (-1,1))  # 2D

    self.x_data = \
      T.tensor(tmp_x, dtype=T.float32).to(device)
    self.y_data = \
      T.tensor(float_targets, dtype=T.float32).to(device)

  def __len__(self):
    return len(self.x_data)

  def __getitem__(self, idx):
    preds = self.x_data[idx]  # all cols
    price = self.y_data[idx]  # float target
    return (preds, price)  # tuple of (predictors, target)
# -----------------------------------------------------------
class Net(T.nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    self.hid1 = T.nn.Linear(13, 10)  # 13-(10-10)-1
    self.hid2 = T.nn.Linear(10, 10)
    self.oupt = T.nn.Linear(10, 1)

    T.nn.init.xavier_uniform_(self.hid1.weight)  # glorot
    T.nn.init.zeros_(self.hid1.bias)
    T.nn.init.xavier_uniform_(self.hid2.weight)
    T.nn.init.zeros_(self.hid2.bias)
    T.nn.init.xavier_uniform_(self.oupt.weight)
    T.nn.init.zeros_(self.oupt.bias)

  def forward(self, x):
    z = T.relu(self.hid1(x))  # or T.nn.Tanh()
    z = T.relu(self.hid2(z))
    z = T.sigmoid(self.oupt(z))
    return z
# -----------------------------------------------------------
def class_to_target(c, k):
  # hard-coded for k = 5 classes
  if c == 0:   return 0.1
  elif c == 1: return 0.3
  elif c == 2: return 0.5
  elif c == 3: return 0.7
  elif c == 4: return 0.9

def class_to_target_program(c, k):
  # mildly inefficient to compute targets every time
  targets = np.zeros(k, dtype=np.float32)
  start = 1.0 / (2 * k)
  delta = 1.0 / k
  for i in range(k):
    targets[i] = start + (i * delta)
  return targets[c]
# ----------------------------------------------------------
def oupt_to_class(oupt, k):
  # hard-coded for k = 5 classes
  if oupt >= 0.0 and oupt < 0.2:    return 0
  elif oupt >= 0.2 and oupt < 0.4:  return 1
  elif oupt >= 0.4 and oupt < 0.6:  return 2
  elif oupt >= 0.6 and oupt < 0.8:  return 3
  elif oupt >= 0.8 and oupt <= 1.0: return 4

def oupt_to_class_program(oupt, k):
  # mildly inefficient to compute end_pts every time
  end_pts = np.zeros(k+1, dtype=np.float32)
  delta = 1.0 / k
  for i in range(k):
    end_pts[i] = i * delta
  end_pts[k] = 1.0
  # [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

  for i in range(k):
    if oupt >= end_pts[i] and oupt <= end_pts[i+1]:
      return i
  return -1  # fatal error
# -----------------------------------------------------------
def accuracy(model, ds, k):
  n_correct = 0; n_wrong = 0
  delta = (1.0 / k) / 2
  for i in range(len(ds)):  # each input
    (X, y) = ds[i]  # (predictors, target)
    with T.no_grad():  # y target is like 0.3
      oupt = model(X)  # oupt is in [0.0, 1.0]

    if T.abs(oupt - y) <= delta:
      n_correct += 1
    else:
      n_wrong += 1

  acc = (n_correct * 1.0) / (n_correct + n_wrong)
  return acc
# -----------------------------------------------------------
def train(net, ds, bs, lr, me, le):
  # network, dataset, batch_size, learn_rate,
  # max_epochs, log_every
  train_ldr = T.utils.data.DataLoader(ds,
    batch_size=bs, shuffle=True)
  loss_func = T.nn.MSELoss()
  opt = T.optim.Adam(net.parameters(), lr=lr)

  for epoch in range(0, me):
    # T.manual_seed(1+epoch)  # recovery reproducibility
    epoch_loss = 0  # for one full epoch
    for (b_idx, batch) in enumerate(train_ldr):
      (X, y) = batch  # (predictors, targets)
      opt.zero_grad()  # prepare gradients
      oupt = net(X)  # predicted prices
      loss_val = loss_func(oupt, y)  # a tensor
      epoch_loss += loss_val.item()  # accumulate
      loss_val.backward()  # compute gradients
      opt.step()  # update weights

    if epoch % le == 0:
      print("epoch = %4d   loss = %0.4f" % \
        (epoch, epoch_loss))
      # TODO: save checkpoint
# -----------------------------------------------------------
def main():
  # 0. get started
  print("\nBegin predict Boston ordinal regression (price) ")
  print("Simplified version computes float targets on fly ")
  T.manual_seed(1)
  np.random.seed(1)

  # 1. create Dataset object
  print("Creating Boston Dataset object ")
  train_file = ".\\Data\\boston_ordinal.txt"
  train_ds = BostonDataset(train_file, k=5)  # 5 classes

  # 2. create network
  net = Net().to(device)
  net.train()  # set mode

  # 3. train model
  bat_size = 10
  lrn_rate = 0.010
  max_epochs = 500
  log_every = 100

  print("\nbat_size = %3d " % bat_size)
  print("lrn_rate = %0.3f " % lrn_rate)
  print("loss = MSELoss ")
  print("optimizer = Adam ")
  print("max_epochs = %3d " % max_epochs)

  print("\nStarting training ")
  train(net, train_ds, bat_size, lrn_rate,
    max_epochs, log_every)
  print("Training complete ")

  # 4. evaluate model accuracy
  print("\nComputing model accuracy")
  net.eval()
  acc_train = accuracy(net, train_ds, k=5)
  print("Accuracy on train data = %0.4f" % acc_train)

  # 5. use model to make a prediction
  np.set_printoptions(precision=6,
    suppress=True, sign=" ")
  x = np.array([[0.000063, 0.18, 0.0231, -1,
                 0.538, 0.6575, 0.652, 0.0409,
                 0.0100, 0.296, 0.153, 0.3969,
                 0.0498]], dtype=np.float32)  # a class '2' item
  print("\nPredicting house price for: ")
  print(x)

  x = T.tensor(x, dtype=T.float32)
  with T.no_grad():
    oupt = net(x)
  print("\npredicted price (normed) = %0.4f " % oupt)
  c = oupt_to_class(oupt, k=5)
  print("predicted class label = %d " % c)

  print("\nEnd Boston ordinal price demo")

if __name__ == "__main__":
  main()
