Why PyTorch Binary Classification Labels Must Be Floats But Multi-Class Classification Labels Must Be Integers

Suppose you have a dataset that looks like this:

# sex age   state  income  politics
   0  0.27  0 1 0  0.7610  0 0 1
   1  0.19  0 0 1  0.6550  1 0 0
. . . 

Now suppose you want to predict sex (male = 0, female = 1) from age, state of residence, income, and political leaning. This is a binary classification problem. You would read the data into memory as PyTorch tensors where both the predictors and the labels are floats:

  import torch as T
  import numpy as np
  device = T.device('cpu')

  all_data = np.loadtxt(src_file, usecols=range(0,9),
    delimiter="\t", comments="#", dtype=np.float32) 

  self.x_data = T.tensor(all_data[:,1:9],
    dtype=T.float32).to(device)
  self.y_data = T.tensor(all_data[:,0],
    dtype=T.float32).to(device)  # float32 required

  self.y_data = self.y_data.reshape(-1,1)  # 2-D required

Even though conceptually the sex labels are categorical, PyTorch requires you to read them as floats if you use binary cross entropy loss (BCELoss).
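A minimal sketch of the requirement (the prediction and label values here are made up for illustration): BCELoss accepts float32 targets but rejects int64 targets with a runtime dtype error.

```python
import torch as T

loss_func = T.nn.BCELoss()
y_pred = T.tensor([[0.8], [0.3]], dtype=T.float32)  # computed outputs in (0, 1)
y_true = T.tensor([[1.0], [0.0]], dtype=T.float32)  # float32 labels: OK
loss = loss_func(y_pred, y_true)                    # works

# int64 labels fail with a dtype mismatch RuntimeError, e.g.:
# y_bad = T.tensor([[1], [0]], dtype=T.int64)
# loss_func(y_pred, y_bad)  # RuntimeError (expected Float, found Long)
```

Note that both the predictions and the targets are shaped [n, 1], matching the 2-D reshape in the data-loading code above.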


A demo of binary classification with incorrect labels read into memory as integers instead of floats.

If t is the target label value (0 or 1) and y is the computed predicted value (between 0.0 and 1.0), then binary cross entropy = -[log(y) * t + log(1-y) * (1-t)]. If t = 1, the (1-t) term drops out, leaving -log(y); if t = 0, the t term drops out, leaving -log(1-y). If the t labels were integers, PyTorch could implicitly cast them to floats. That wouldn't be a huge issue, but the designers of PyTorch decided to require the labels to be type float.
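The formula can be verified with a few lines of plain Python (the 0.8 prediction value is just an example):

```python
import math

def bce(t, y):
  # binary cross entropy for a single example:
  # -[t * log(y) + (1-t) * log(1-y)]
  return -(t * math.log(y) + (1 - t) * math.log(1 - y))

print(bce(1, 0.8))  # -log(0.8), about 0.2231
print(bce(0, 0.8))  # -log(0.2), about 1.6094
```

As expected, a confident prediction of 0.8 gives a small loss when the target is 1 and a large loss when the target is 0.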

Now suppose your data is:

# sex age   state  income  politics
  -1  0.27  0 1 0  0.7610    2
   1  0.19  0 0 1  0.6550    0
. . . 

And you want to predict political leaning (conservative = 0, moderate = 1, liberal = 2) from sex, age, state, and income. This is a multi-class classification problem. In this case you must read the predictors as floats and the labels as integers:

  import torch as T
  import numpy as np
  device = T.device('cpu')

  all_data = np.loadtxt(src_file, usecols=range(0,7),
    delimiter="\t", comments="#", dtype=np.float32) 

  self.x_data = T.tensor(all_data[:,0:6],
    dtype=T.float32).to(device)
  self.y_data = T.tensor(all_data[:,6],
    dtype=T.int64).to(device)  # int, 1-D required

The target labels are integers because they will be used as indices into the output vectors, which are log-softmax values.
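A minimal sketch of the indexing idea, using NLLLoss (which expects log-softmax outputs) with made-up logit values: each int64 label selects one component of the corresponding output row, and the loss is the negative of the selected log-softmax values, averaged.

```python
import torch as T

loss_func = T.nn.NLLLoss()  # expects log-softmax outputs
logits = T.tensor([[2.0, 0.5, 0.1],
                   [0.2, 0.3, 2.5]], dtype=T.float32)
log_probs = T.log_softmax(logits, dim=1)

labels = T.tensor([0, 2], dtype=T.int64)  # 1-D int64 class indices
loss = loss_func(log_probs, labels)
# equivalent to: -(log_probs[0, 0] + log_probs[1, 2]) / 2
```

CrossEntropyLoss behaves the same way but applies log-softmax internally, so it would take the raw logits directly; either way the labels must be a 1-D int64 tensor of class indices.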

Notice that in the binary classification scenario, labels are read into a 2-D matrix, but in the multi-class scenario, labels are read into a 1-D vector. Dealing with PyTorch data shapes can be very tricky and time-consuming during development.
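The shape difference can be seen directly (the label values here are arbitrary examples):

```python
import torch as T

# binary labels: float32, reshaped to a 2-D [n, 1] matrix for BCELoss
y_bin = T.tensor([0.0, 1.0, 1.0], dtype=T.float32).reshape(-1, 1)
print(y_bin.shape)    # torch.Size([3, 1])

# multi-class labels: int64, left as a 1-D [n] vector for NLLLoss
y_multi = T.tensor([2, 0, 1], dtype=T.int64)
print(y_multi.shape)  # torch.Size([3])
```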



Attention to detail is important in machine learning. Two cartoons that illustrate a lack of attention to detail by law enforcement. By cartoonist Jim Unger (1937-2012).

