I’m preparing the content for an all-day hands-on workshop. My main topics are all about neural networks, but I have a few classical techniques too, including naive Bayes classification. Here’s an example that I’ll use in the workshop.
There are 40 data items that look like:
actuary green korea 1
barista green italy 0
dentist hazel japan 0
chemist hazel japan 2
. . .
Each line of data is a person. The columns are job-type, eye-color, country, and personality extraversion (0, 1, 2). Suppose you want to predict the personality extraversion score of a person who is (barista, hazel, italy).
The first step is to compute the joint counts of each class (0, 1, 2) looking at each predictor variable separately (“naive”).
barista and class 0 = 4 + 1 = 5
barista and class 1 = 0 + 1 = 1
barista and class 2 = 1 + 1 = 2
hazel and class 0 = 5 + 1 = 6
hazel and class 1 = 2 + 1 = 3
hazel and class 2 = 2 + 1 = 3
italy and class 0 = 1 + 1 = 2
italy and class 1 = 5 + 1 = 6
italy and class 2 = 1 + 1 = 2
You add 1 to each raw count so that no count is 0. This is called Laplacian Smoothing.
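The counting in step 1 can be checked with a few lines of plain Python. This is a minimal sketch rather than the demo program itself: it embeds the 40 demo rows directly (packed four rows per line) so that it runs without a data file, and the names `DATA`, `load_rows` and `smoothed_joint_counts` are illustrative choices, not part of any library.

```python
# The 40 demo rows (job, eye color, country, class), packed four rows per line.
DATA = """
actuary green korea 1  barista green italy 0  dentist hazel japan 0  dentist green japan 1
chemist hazel japan 2  actuary green japan 1  actuary green japan 0  chemist green italy 1
chemist green italy 2  dentist green japan 1  dentist green japan 0  dentist green japan 1
dentist green japan 2  chemist green italy 1  dentist green japan 1  dentist hazel japan 0
chemist green korea 1  barista green japan 0  actuary green italy 1  actuary green italy 1
dentist green korea 0  barista green japan 2  dentist green japan 0  barista green korea 0
dentist green japan 0  actuary hazel italy 1  dentist hazel japan 0  dentist green japan 2
dentist green japan 0  chemist hazel japan 2  dentist green korea 0  dentist hazel korea 0
dentist green japan 0  dentist green japan 2  dentist hazel japan 0  actuary hazel japan 1
actuary green japan 0  actuary green japan 1  dentist green japan 0  barista green japan 0
"""

def load_rows(text):
    """Split the whitespace-separated tokens into rows of four fields."""
    tokens = text.split()
    return [tokens[i:i + 4] for i in range(0, len(tokens), 4)]

def smoothed_joint_counts(rows, x, n_classes):
    """For each predictor value in x, count matching rows per class, then add 1."""
    counts = [[0] * n_classes for _ in x]
    for row in rows:
        label = int(row[-1])  # class label is the last field
        for j, value in enumerate(x):
            if row[j] == value:
                counts[j][label] += 1
    # Add 1 to every count so that none is zero
    return [[c + 1 for c in per_class] for per_class in counts]

item = ["barista", "hazel", "italy"]
rows = load_rows(DATA)
joint = smoothed_joint_counts(rows, item, n_classes=3)
for name, per_class in zip(item, joint):
    print(name, per_class)
# barista [5, 1, 2]
# hazel [6, 3, 3]
# italy [2, 6, 2]
```

Each printed row holds the smoothed counts for classes 0, 1 and 2. These are the numerators that appear in the evidence formulas of step 3.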
The second step is to compute the raw counts, without smoothing, of each class:
class 0 = 19
class 1 = 14
class 2 = 7
The third step is to combine the results from step 1 and 2 using some fancy probability (“Bayes”), to get what are called evidence values (Z) for each class:
Z(0) = (5 / (19+3)) * (6 / (19+3)) * (2 / (19+3)) * (19 / 40)
     = 5/22 * 6/22 * 2/22 * 19/40
     = 0.2273 * 0.2727 * 0.0909 * 0.4750
     = 0.0027
Z(1) = (1 / (14+3)) * (3 / (14+3)) * (6 / (14+3)) * (14 / 40)
     = 1/17 * 3/17 * 6/17 * 14/40
     = 0.0588 * 0.1765 * 0.3529 * 0.3500
     = 0.0013
Z(2) = (2 / (7+3)) * (3 / (7+3)) * (2 / (7+3)) * (7 / 40)
     = 2/10 * 3/10 * 2/10 * 7/40
     = 0.2000 * 0.3000 * 0.2000 * 0.1750
     = 0.0021
Note: All the “+3” terms are because there are 3 predictor variables. At this point, the predicted class is the one with the largest evidence value, which is class 0.
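The arithmetic of step 3 is easy to verify in code. The sketch below takes the smoothed joint counts and raw class counts as given, and applies the same formula as the worked example: each smoothed count is divided by the class count plus the number of predictors, and the product is multiplied by the class prior. The names `JOINT_COUNTS`, `CLASS_COUNTS` and `evidence_terms` are illustrative.

```python
# Smoothed joint counts: rows are (barista, hazel, italy), columns are classes 0, 1, 2.
JOINT_COUNTS = [[5, 1, 2],
                [6, 3, 3],
                [2, 6, 2]]
CLASS_COUNTS = [19, 14, 7]  # raw class counts, no smoothing

def evidence_terms(joint_counts, class_counts):
    """Return one evidence value Z(k) per class."""
    n_predictors = len(joint_counts)
    n_items = sum(class_counts)
    terms = []
    for k, class_count in enumerate(class_counts):
        z = class_count / n_items  # class prior, e.g. 19 / 40
        for j in range(n_predictors):
            # smoothed conditional term, e.g. 5 / (19 + 3)
            z *= joint_counts[j][k] / (class_count + n_predictors)
        terms.append(z)
    return terms

z = evidence_terms(JOINT_COUNTS, CLASS_COUNTS)
print([round(v, 4) for v in z])  # [0.0027, 0.0013, 0.0021]
print(z.index(max(z)))           # 0, the predicted class
```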
An optional final step is to normalize the evidence values so that they sum to 1.0 and can be loosely interpreted as pseudo-probabilities. The easiest way to do this is to divide each evidence value by the sum:
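sum = 0.0027 + 0.0013 + 0.0021 = 0.0061
P(class 0) = 0.0027 / 0.0061 = 0.4418
P(class 1) = 0.0013 / 0.0061 = 0.2116
P(class 2) = 0.0021 / 0.0061 = 0.3466
The quotients are computed from the unrounded evidence values, so they differ slightly from what you get by dividing the rounded numbers shown.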
As before, class 0 has the largest pseudo-probability so that’s the predicted class for a (barista, hazel, italy) person.
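Normalization is a one-liner once the evidence values are available. This small sketch recomputes the three evidence values from their fractions, so the unrounded numbers are used, and divides each by the sum. The function name `normalize` is an illustrative choice.

```python
def normalize(evidence):
    """Divide each evidence value by the total so the results sum to 1.0."""
    total = sum(evidence)
    return [z / total for z in evidence]

# Unrounded evidence values for classes 0, 1, 2 from the worked example.
evidence = [
    (5 / 22) * (6 / 22) * (2 / 22) * (19 / 40),
    (1 / 17) * (3 / 17) * (6 / 17) * (14 / 40),
    (2 / 10) * (3 / 10) * (2 / 10) * (7 / 40),
]
probs = normalize(evidence)
print([round(p, 4) for p in probs])  # [0.4418, 0.2116, 0.3466]
```

Because dividing by a positive constant does not change the ordering, the largest pseudo-probability always belongs to the same class as the largest evidence value.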
There are many variations of naive Bayes classification. This example is just one version, for problems where the predictor values are categorical (non-numeric).

The term “naive” means simple and unsophisticated. The term applies well to my two dogs, Kevin and Riley. Left: Kevin when he just joined my family which already included Riley. Center: I woke up from a nap one afternoon, to find that Riley had proudly brought me my “Chess Life” magazine and some socks. She is waiting for praise. Right: Kevin went through a phase where he was obsessed by socks.
Demo code:
# naive_bayes.py
# Anaconda3-2020.02 Python 3.7.6
# Windows 10/11
import numpy as np
# -----------------------------------------------------------
def main():
    print("\nBegin naive Bayes classification ")
    data = np.loadtxt(".\\people_data.txt", dtype=str,
        delimiter=" ", comments="#")
    print("\nData looks like: ")
    for i in range(5):
        print(data[i])
    print(". . . \n")
    nx = 3  # number predictor variables
    nc = 3  # number classes
    N = 40  # data items
    joint_cts = np.zeros((nx,nc), dtype=np.int64)
    y_cts = np.zeros(nc, dtype=np.int64)
    # -----------------------------------------------------------
    # X = ['dentist', 'hazel', 'italy']
    X = ['barista', 'hazel', 'italy']
    print("Item to predict/classify: ")
    print(X)
    for i in range(N):
        y = int(data[i,nx])  # class is in last column
        y_cts[y] += 1
        for j in range(nx):
            if data[i][j] == X[j]:
                joint_cts[j][y] += 1
    joint_cts += 1  # Laplacian smoothing
    print("\nJoint counts (smoothed): ")
    print(joint_cts)
    print("\nClass counts (raw): ")
    print(y_cts)
    # -----------------------------------------------------------
    # compute evidence terms directly
    # e_terms = np.zeros(nc, dtype=np.float32)
    # for k in range(nc):
    #     v = 1.0
    #     for j in range(nx):
    #         v *= joint_cts[j,k] / (y_cts[k] + nx)
    #     v *= y_cts[k] / N
    #     e_terms[k] = v
    # -----------------------------------------------------------
    # compute evidence terms using log trick to avoid underflow
    e_terms = np.zeros(nc, dtype=np.float32)
    for k in range(nc):
        v = 0.0
        for j in range(nx):
            v += np.log(joint_cts[j,k]) - np.log(y_cts[k] + nx)
        v += np.log(y_cts[k]) - np.log(N)
        e_terms[k] = np.exp(v)
    # -----------------------------------------------------------
    np.set_printoptions(precision=4, suppress=True)
    print("\nEvidence terms: ")
    print(e_terms)
    sum_evidence = np.sum(e_terms)
    probs = np.zeros(nc, dtype=np.float32)
    for k in range(nc):
        probs[k] = e_terms[k] / sum_evidence
    print("\nPseudo-probabilities: ")
    print(probs)
    pc = np.argmax(probs)
    print("\nPredicted class: ")
    print(pc)
    print("\nEnd naive Bayes demo ")

if __name__ == "__main__":
    main()
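The demo sums logs and then converts back with exp(). With only three predictors that is safe, but with many predictors the final exp() could itself underflow. The sketch below, in plain Python, illustrates the problem with a hypothetical stress case and shows one common workaround: keep the scores in log space and subtract the largest score before exponentiating. The function names and the 400-predictor scenario are assumptions made for illustration and are not part of the demo.

```python
import math

def direct_product(probs):
    """Multiply probabilities directly; underflows to 0.0 for long lists."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def log_score(probs):
    """Sum of logs: the same quantity on a log scale, safe from underflow."""
    return sum(math.log(p) for p in probs)

def normalize_log_scores(log_scores):
    """Turn per-class log scores into pseudo-probabilities that sum to 1.0."""
    top = max(log_scores)
    shifted = [math.exp(s - top) for s in log_scores]  # largest term becomes 1.0
    total = sum(shifted)
    return [s / total for s in shifted]

# Hypothetical stress case: 400 predictors, each contributing a factor of 0.001.
many = [0.001] * 400
print(direct_product(many))       # 0.0, the true value 1e-1200 is too small for a float
print(round(log_score(many), 1))  # -2763.1

# Two hypothetical classes whose log scores differ by 1.0.
# math.exp(-2763.1) is 0.0, so dividing raw evidence values would give 0 / 0.
print([round(p, 4) for p in normalize_log_scores([-2763.1, -2764.1])])
# [0.7311, 0.2689]
```

If only the predicted class is needed, the class with the largest log score can be picked directly, with no exponentiation at all.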
Demo data:
# people_data.txt
# job-type eye-color country extraversion
#
actuary green korea 1
barista green italy 0
dentist hazel japan 0
dentist green japan 1
chemist hazel japan 2
actuary green japan 1
actuary green japan 0
chemist green italy 1
chemist green italy 2
dentist green japan 1
dentist green japan 0
dentist green japan 1
dentist green japan 2
chemist green italy 1
dentist green japan 1
dentist hazel japan 0
chemist green korea 1
barista green japan 0
actuary green italy 1
actuary green italy 1
dentist green korea 0
barista green japan 2
dentist green japan 0
barista green korea 0
dentist green japan 0
actuary hazel italy 1
dentist hazel japan 0
dentist green japan 2
dentist green japan 0
chemist hazel japan 2
dentist green korea 0
dentist hazel korea 0
dentist green japan 0
dentist green japan 2
dentist hazel japan 0
actuary hazel japan 1
actuary green japan 0
actuary green japan 1
dentist green japan 0
barista green japan 0
