Over the past three years or so, neural networks have come to dominate many areas of machine learning. But traditional ML techniques are still useful in some scenarios.
A decision tree is just a set of if-then rules for classification. There are many standalone decision tree tools and code libraries. For relatively simple problems, my usual approach is to use the scikit-learn (sklearn) code library. The sklearn library does not support categorical predictor variables (except binary categorical predictors, which can be encoded as 0-1) — I have seen a lot of incorrect information on this topic. Categorical predictors in a decision tree are a very tricky subject.
There are many good examples of creating a decision tree using scikit-learn, but I never fully understand a technology until I dive into the code myself. So I created an end-to-end example to refresh my memory.
I created a dummy dataset with 12 items:
28 female 27000 democrat
39 male   97000 independent
38 female 64000 republican
27 male   82000 independent
36 male   48000 democrat
55 female 56000 democrat
44 male   88000 independent
42 male   39000 republican
21 male   43000 republican
49 female 91000 independent
30 female 85000 democrat
56 male   41000 republican
The first three columns are age, sex, and annual income. The goal is to predict political party affiliation, in the fourth column. A sklearn decision tree only accepts numeric predictors (but allows non-numeric labels-to-predict), so I converted the male/female data to 0/1 (female = 1, male = 0):
28 1 27000 democrat
39 0 97000 independent
38 1 64000 republican
27 0 82000 independent
36 0 48000 democrat
55 1 56000 democrat
44 0 88000 independent
42 0 39000 republican
21 0 43000 republican
49 1 91000 independent
30 1 85000 democrat
56 0 41000 republican
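The conversion itself is trivial. A minimal sketch, assuming the data lives in memory as space-delimited strings (the post doesn't say how the conversion was actually done):

```python
# Sketch of the male/female to 0-1 conversion, assuming
# space-delimited rows of (age, sex, income, party).
# The post maps female -> 1 and male -> 0.
raw = [
    "28 female 27000 democrat",
    "39 male 97000 independent",
]
converted = []
for line in raw:
    age, sex, income, party = line.split()
    sex01 = "1" if sex == "female" else "0"
    converted.append(" ".join([age, sex01, income, party]))
print(converted)
```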
Next, I wrote a program, using parts of several examples I found, plus a bit of new material I figured out myself through experimentation and the sklearn documentation.
I don’t use decision trees very often. But for very small datasets or when it’s important to be able to interpret how a prediction model makes a specific prediction, decision trees can be very useful.

An Internet image search for just about any word returns some strange results. “Trees”
