Until relatively recently, the traditional way to do multi-class classification with a neural network was to 1.) encode the labels-to-predict in the data file using one-hot encoding (like “0, 1, 0” or “1, 0, 0”), 2.) create a neural network with softmax activation on the output nodes, and 3.) train using mean squared error.
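A minimal sketch of that traditional scheme (the layer sizes and dummy batch here are illustrative assumptions, not from a real program):

```python
import torch as T

# traditional scheme: explicit softmax on the output nodes,
# one-hot targets, mean squared error loss
net = T.nn.Sequential(
  T.nn.Linear(4, 7),
  T.nn.Tanh(),
  T.nn.Linear(7, 3),
  T.nn.Softmax(dim=1))       # explicit softmax activation

loss_func = T.nn.MSELoss()
X = T.rand(8, 4)             # dummy batch of 8 items
Y = T.zeros(8, 3)            # one-hot targets
Y[T.arange(8), T.randint(0, 3, (8,))] = 1.0

probs = net(X)               # each output row sums to 1.0
loss = loss_func(probs, Y)
loss.backward()
```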
But by far the most common way to do multi-class classification with a PyTorch network today is to 1.) encode the labels-to-predict using ordinal encoding (like “0” or “1” or “2”), 2.) create a neural network with no activation on the output nodes, and 3.) train using the special CrossEntropyLoss() function.
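A minimal sketch of this standard scheme (again, the layer sizes and dummy batch are illustrative assumptions):

```python
import torch as T

# standard scheme: no activation on the output nodes,
# ordinal int64 labels, CrossEntropyLoss
net = T.nn.Sequential(
  T.nn.Linear(4, 7),
  T.nn.Tanh(),
  T.nn.Linear(7, 3))                 # raw logits -- no softmax

loss_func = T.nn.CrossEntropyLoss()  # applies log-softmax itself
X = T.rand(8, 4)                     # dummy batch of 8 items
y = T.randint(0, 3, (8,))            # ordinal labels: 0, 1, or 2
logits = net(X)
loss = loss_func(logits, y)          # y is int64, not one-hot
loss.backward()
```

Note that CrossEntropyLoss() expects raw logits and int64 ordinal targets, which is why the data file ordinal values are read with dtype int64.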
I recently explored creating a PyTorch multi-class classifier using the older traditional approach with mean squared error. While I was coding that experiment I realized that an efficient variation would be to use an ordinal encoded data file rather than a one-hot encoded file, and then, when reading the data into memory, programmatically convert the ordinal values to one-hot vectors.
To cut to the chase, the idea worked as expected.
Left: A program that uses MSELoss() with one-hot encoded data. Right: A program that uses MSELoss() with ordinal encoded data that is programmatically converted to one-hot vectors in memory. Both programs give identical results, which is what should happen.
I implemented the idea for the Iris data where there are four predictor variables and three species to predict. The key to the idea was implementing a PyTorch Dataset object to read the ordinal encoded data, and convert it to one-hot encoded tensor data:
import numpy as np
import torch as T

device = T.device("cpu")  # assumed to be defined in the full program

class IrisDataset(T.utils.data.Dataset):
  def __init__(self, src_file, num_rows=None):
    # each data file line is like: 5.0, 3.5, 1.3, 0.3, 2
    tmp_x_data = np.loadtxt(src_file, max_rows=num_rows,
      usecols=range(0,4), delimiter=",", skiprows=0,
      dtype=np.float32)
    tmp_y_data = np.loadtxt(src_file, max_rows=num_rows,
      usecols=[4], delimiter=",", skiprows=0,
      dtype=np.int64)

    self.x_data = T.tensor(tmp_x_data,
      dtype=T.float32).to(device)

    # convert each ordinal label to a one-hot vector
    n_rows = len(tmp_y_data)
    n_cols = 3
    dims = (n_rows, n_cols)
    self.y_data = T.zeros(dims,
      dtype=T.float32).to(device)
    for i in range(n_rows):
      j = tmp_y_data[i]  # the ordinal value 0, 1, or 2
      self.y_data[i][j] = 1.0

    self.num_rows = n_rows

  def __len__(self):
    return self.num_rows

  def __getitem__(self, idx):
    if T.is_tensor(idx):
      idx = idx.tolist()
    preds = self.x_data[idx]
    spcs = self.y_data[idx]
    sample = { 'predictors' : preds, 'species' : spcs }
    return sample
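As an aside, the explicit per-row loop in __init__ could also be written as a single vectorized call, a sketch of which (with dummy labels) is:

```python
import torch as T

# vectorized alternative to the per-row loop in __init__:
# convert ordinal labels to one-hot float32 targets in one call
ordinals = T.tensor([2, 0, 1, 0], dtype=T.int64)  # dummy labels
one_hot = T.nn.functional.one_hot(ordinals,
  num_classes=3).to(T.float32)
# one_hot is [[0,0,1], [1,0,0], [0,1,0], [1,0,0]]
```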
There isn’t much code in the Dataset class, but it’s surprisingly tricky in the details. The indirect moral of the story is this: Any successful project or company or effort needs people who see “the big picture” — the idea people. But success only happens with execution. Especially in tech companies, it’s easy to come up with a big idea, but it’s never easy to write the code to make that idea come to life.
Two somewhat enigmatic illustrations by Japanese artist Ichiro Tsuruta (1954-). He is known for “bijinga” (meaning beautiful female figure) works like these. It’s easy for someone to think of creating an illustration like any of these, but it’s another matter to execute. I don’t find these illustrations particularly appealing, but I have great respect for the effort required to create them.