The goal of a machine learning model is to predict a single numeric value (regression) or a single discrete value, such as the political leaning of a person (classification, or binary classification when there are only two possible values). To create a model you must have data. For example, suppose some data looks like:
F, 24, Michigan, 29500.00, liberal
M, 39, Oklahoma, 51200.00, moderate
F, 63, Nebraska, 75800.00, conservative
. . .
The fields are sex, age, State, income, politics. Using this data you could predict any one of the variables from the other variables. For example, you could predict income from sex, age, State, and politics.
Real-life data often has missing values. For example:
F, 24, Michigan, missing, liberal
M, 39, Oklahoma, 51200.00, moderate
missing, 63, Nebraska, 75800.00, conservative
. . .
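Before deciding what to do about missing values, it helps to know where they are. Here is a minimal sketch that loads the three demo rows and counts the gaps. It assumes the pandas library, assumes the file literally uses the word "missing" as the marker, and uses column names I made up to match the fields.

```python
import io
import pandas as pd

# The three demo rows, with the literal token "missing" marking a gap.
raw = """F, 24, Michigan, missing, liberal
M, 39, Oklahoma, 51200.00, moderate
missing, 63, Nebraska, 75800.00, conservative
"""

# Column names are assumptions chosen to match the fields in the post.
cols = ["sex", "age", "state", "income", "politics"]

# skipinitialspace strips the blank after each comma so that
# " missing" is read as "missing" and converted to NaN.
df = pd.read_csv(io.StringIO(raw), header=None, names=cols,
                 skipinitialspace=True, na_values=["missing"])

missing_per_column = df.isna().sum()                 # gaps in each column
rows_with_missing = int(df.isna().any(axis=1).sum()) # rows with >= 1 gap

print(missing_per_column)
print("rows with at least one missing value:", rows_with_missing)
```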
The obvious, and best, approach is to toss out data rows that have one or more missing values. But for some reason, a standard machine learning technique for missing data is to supply imputed values. For example, for the missing sex value in the third row, you could insert the most common sex in the data, male or female. And for the missing income value in the first row, you could insert the average of the non-missing income values.
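To make the two options concrete, here is a rough sketch of both, applied to the three demo rows. It assumes pandas and NumPy, and the column names are my own labels. With only three rows the most common sex is a tie, so the imputed value is whichever one pandas happens to list first. That arbitrariness is part of the problem.

```python
import numpy as np
import pandas as pd

# Demo rows from the post; np.nan / None stand in for "missing".
rows = [
    ["F", 24, "Michigan", np.nan, "liberal"],
    ["M", 39, "Oklahoma", 51200.00, "moderate"],
    [None, 63, "Nebraska", 75800.00, "conservative"],
]
# Column names are assumptions chosen to match the fields in the post.
df = pd.DataFrame(rows, columns=["sex", "age", "state", "income", "politics"])

# Option 1: toss out every row that has one or more missing values.
dropped = df.dropna()

# Option 2: impute. Most common value for sex (ties resolved by
# whatever mode() returns first), mean of the known values for income.
imputed = df.copy()
imputed["sex"] = imputed["sex"].fillna(imputed["sex"].mode()[0])
imputed["income"] = imputed["income"].fillna(imputed["income"].mean())

print(dropped)
print(imputed)
```

Dropping leaves just the Oklahoma row. Imputing keeps all three rows, but the first row now carries an income of 63500.00, a number nobody ever reported.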
The scikit-learn library has a module for supplying imputed values, but I can’t think of any scenarios where using it would be a good idea.
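For reference, here is a minimal sketch of what using that module looks like. It assumes a reasonably recent scikit-learn, where the class is SimpleImputer in sklearn.impute, and it uses only the age and income columns from the demo data so that everything is numeric.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Columns are age and income; np.nan marks the missing income in row 0.
X = np.array([
    [24.0, np.nan],
    [39.0, 51200.0],
    [63.0, 75800.0],
])

# strategy="mean" replaces each NaN with the mean of the
# non-missing values in the same column.
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)

print(X_filled)
```

The missing income gets filled with the average of the other two incomes, which is exactly the kind of made-up value I'm skeptical of.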
Imputing missing values makes absolutely no sense to me from a principled point of view. At best, the resulting prediction model will be sketchy, and the model could be flat-out misleading.
There’s no big moral to this post other than that common sense should always prevail.

Missing data in machine learning is always bad. But missing detail in art is a good thing. I don’t like photo-realistic art; I much prefer a certain level of abstraction where detail is missing.
