Suppose you have a dataset that looks like this:
# sex  age    state    income   politics
   0   0.27   0 1 0    0.7610   0 0 1
   1   0.19   0 0 1    0.6550   1 0 0
. . .
Now suppose you want to predict sex (male = 0, female = 1) from age, state of residence, income, and political leaning. This is a binary classification problem. You would read the data into memory as PyTorch tensors where both the predictors and the labels are floats:
import torch as T
import numpy as np
device = T.device('cpu')
all_data = np.loadtxt(src_file, usecols=range(0,9),
  delimiter="\t", comments="#", dtype=np.float32)
self.x_data = T.tensor(all_data[:,1:9],
  dtype=T.float32).to(device)
self.y_data = T.tensor(all_data[:,0],
  dtype=T.float32).to(device)  # float32 required
self.y_data = self.y_data.reshape(-1,1)  # 2-D required
Even though conceptually the sex labels are categorical, PyTorch requires you to read them as floats if you use binary cross entropy loss (BCELoss).

A demo of binary classification where the labels were incorrectly read into memory as integers instead of floats.
If t is the target label value (0 or 1) and y is the computed predicted value (between 0.0 and 1.0), binary cross entropy = -[t * log(y) + (1-t) * log(1-y)]. If t = 1, the (1-t) term drops out, leaving -log(y); if t = 0, the t term drops out, leaving -log(1-y). If the t labels were integers, an implicit type cast to float would be needed. This isn't really a huge issue, but the designers of PyTorch decided to require the labels to be type float.
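The arithmetic is easy to check by hand. Here is a minimal sketch of the binary cross entropy calculation in plain Python (no PyTorch needed), using a hypothetical bce() helper to verify the two cases described above:

```python
import math

def bce(t, y):
    # binary cross entropy for a single item:
    # -[t * log(y) + (1 - t) * log(1 - y)]
    return -(t * math.log(y) + (1 - t) * math.log(1 - y))

# if t = 1, only the -log(y) term survives
print(round(bce(1, 0.80), 4))  # 0.2231 = -log(0.80)

# if t = 0, only the -log(1-y) term survives
print(round(bce(0, 0.80), 4))  # 1.6094 = -log(0.20)
```

This is the same value that PyTorch's BCELoss computes for one item; BCELoss averages it over a batch by default.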
Now suppose your data is:
# sex  age    state    income   politics
  -1   0.27   0 1 0    0.7610   2
   1   0.19   0 0 1    0.6550   0
. . .
And you want to predict political leaning (conservative = 0, moderate = 1, liberal = 2) from sex, age, state, and income. This is a multi-class classification problem. In this case you must read the predictors as floats and the labels as integers:
import torch as T
import numpy as np
device = T.device('cpu')
all_data = np.loadtxt(src_file, usecols=range(0,7),
  delimiter="\t", comments="#", dtype=np.float32)
self.x_data = T.tensor(all_data[:,0:6],
  dtype=T.float32).to(device)
self.y_data = T.tensor(all_data[:,6],
  dtype=T.int64).to(device)  # int, 1-D required
The target labels are integers because they will be used as indices into the output vectors, which are log-softmax values.
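To see why the label must be an integer, here is a minimal sketch of the indexing in plain Python, under the assumption that the network emits three raw logit values and the loss is the negative log-softmax value at the label's index (the logit values below are made up for illustration):

```python
import math

def log_softmax(logits):
    # numerically stable log-softmax over a list of raw outputs
    m = max(logits)
    log_sum = math.log(sum(math.exp(z - m) for z in logits))
    return [z - m - log_sum for z in logits]

logits = [1.5, 0.5, -0.5]   # hypothetical raw outputs for 3 classes
ls = log_softmax(logits)

label = 2                   # integer label, used as an index
loss = -ls[label]           # negative log likelihood for this item
print(round(loss, 4))       # 2.4078
```

A float label like 2.0 could not be used as a list (or tensor) index, which is why PyTorch's NLLLoss and CrossEntropyLoss demand int64 targets.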
Notice that in the binary classification scenario, labels are read into a 2-D matrix, but in the multi-class scenario, labels are read into a 1-D vector. Dealing with PyTorch data shapes can be very tricky and time-consuming during development.
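The shape distinction can be sketched with NumPy alone (the same reshape(-1,1) idiom the PyTorch snippets above use; the label values here are made up):

```python
import numpy as np

# binary case: float labels, reshaped from 1-D to 2-D column form
y_bin = np.array([0.0, 1.0, 1.0], dtype=np.float32)
y_bin = y_bin.reshape(-1, 1)    # shape (3, 1) -- 2-D, as BCELoss expects

# multi-class case: int labels, left as a 1-D vector
y_multi = np.array([2, 0, 1], dtype=np.int64)  # shape (3,) -- 1-D

print(y_bin.shape, y_multi.shape)
```

The -1 in reshape(-1,1) tells NumPy (and PyTorch) to infer the row count, so the same line works for any number of items.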

Attention to detail is important in machine learning. These two cartoons, by cartoonist Jim Unger (1937-2012), illustrate a lack of attention to detail by law enforcement.