One of my standard neural network examples is sentiment classification on the IMDB Movie Review dataset. The goal is to predict the sentiment (0 = negative, 1 = positive) of a natural language movie review such as, “The movie was a great waste of my time.” This is a very difficult natural language processing (NLP) problem.
My basic example uses an LSTM (long short-term memory) architecture. I have an advanced version that uses a Transformer architecture, but that’s another topic.
Processing the data to get it ready for a neural system is a big challenge. I fetched the raw movie review data from https://ai.stanford.edu/~amaas/data/sentiment/. The movie review data is in gnu-zip, tape-archive format. I extracted the data using the 7-Zip utility program. This created a complex set of files. There are a total of 50,000 movie reviews — 25,000 for training and 25,000 for testing. Each set has 12,500 positive reviews and 12,500 negative reviews.
I wrote a Python helper program that filtered the training and test reviews down to those with 50 words or fewer. The program converts each word to an integer ID where low numbers are common words, e.g. ID = 4 is “the”, ID = 5 is “and”, and so on. ID = 0 is used for padding so that all reviews have exactly 50 tokens. See https://jamesmccaffreyblog.com/2022/01/17/imdb-movie-review-sentiment-analysis-using-an-lstm-with-pytorch/.
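The frequency-based encoding can be illustrated with a tiny sketch. This is not my actual helper program. The names build_vocab and encode, the toy reviews, and the choice to start word IDs at 4 (treating IDs 0 to 3 as reserved) are assumptions made only for illustration.

```python
from collections import Counter

# Assumption: IDs 0-3 are reserved (0 = padding), so the most
# frequent word gets ID 4. The reserved range is illustrative only.
FIRST_WORD_ID = 4

def build_vocab(reviews):
    """Map each word to an integer ID; more frequent words get lower IDs."""
    counts = Counter(w for review in reviews for w in review.split())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: FIRST_WORD_ID + i for i, word in enumerate(ordered)}

def encode(review, vocab):
    """Convert a review string to a list of word IDs (unknown words are skipped)."""
    return [vocab[w] for w in review.split() if w in vocab]

# Hypothetical toy corpus, not real IMDB data.
toy_reviews = [
    "the movie was great",
    "the plot and the acting",
    "the movie and the ending",
]
vocab = build_vocab(toy_reviews)
print(vocab["the"], vocab["and"])
print(encode("the movie and the plot", vocab))
```

In this toy corpus “the” is the most common word and so gets ID 4, and “and” wins the tie with “movie” alphabetically and gets ID 5. A real vocabulary would be built from all the training reviews.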
The result is a file where each line is a movie review. The first 50 values are the encoded movie review words and the last value on the line is 0 (negative review) or 1 (positive review). I put the padding 0s at the beginning of each review that’s shorter than 50 words but I could have put them at the end.
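The line format described above can be sketched as follows. The function name to_line is hypothetical, and I assume the values are separated by single spaces. The word IDs are the ones the demo uses for “the movie was a great waste of my time”.

```python
MAX_LEN = 50  # tokens per review, as described in the text

def to_line(word_ids, label, max_len=MAX_LEN):
    """Left-pad word IDs with 0s to max_len, then append the 0/1 label."""
    if len(word_ids) > max_len:
        raise ValueError("review is longer than max_len")
    padded = [0] * (max_len - len(word_ids)) + list(word_ids)
    return " ".join(str(v) for v in padded + [label])

# Word IDs for the nine-word example review; label 0 = negative.
ids = [4, 20, 16, 6, 86, 425, 7, 58, 64]
line = to_line(ids, 0)
fields = line.split(" ")
print(len(fields))  # 50 token values plus 1 label
print(line)
```

Putting the padding at the end instead would only require swapping the order of the two list pieces inside to_line.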
I designed a simple LSTM network. An Embedding layer converts each word/token ID to a vector of 32 values. The LSTM layer uses an internal state size of 100 values. I used a batch-first geometry (dealing with the geometries of NLP systems is a big pain). I applied a Dropout layer to limit overfitting, followed by a Dense (linear) layer with sigmoid() activation that condenses the output to a single value between 0.0 and 1.0. An output of less than 0.5 means class 0 = negative review; otherwise the prediction is class 1 = positive review.
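The last step, turning the network's single output value into a class, can be shown in isolation. This is a minimal sketch in plain Python, not Keras code. The names sigmoid and to_class and the sample pre-activation values are made up for illustration.

```python
import math

def sigmoid(z):
    """Squash any real value into the range (0.0, 1.0)."""
    return 1.0 / (1.0 + math.exp(-z))

def to_class(p, threshold=0.5):
    """Less than the threshold means class 0 (negative), otherwise class 1."""
    return 0 if p < threshold else 1

# Hypothetical pre-activation values from the final layer.
for z in (-2.0, 0.0, 1.5):
    p = sigmoid(z)
    print("z = %5.2f  p = %0.4f  class = %d" % (z, p, to_class(p)))
```

Note that an output of exactly 0.5 is assigned to class 1 under the "less than 0.5 means negative" rule.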
For training, I used Adam optimization with an initial learning rate of 0.001 and a batch size of 16 reviews.
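To make the batch size concrete, here is a rough sketch of how one epoch can be split into shuffled batches of 16. Keras does this internally when fit() is called with shuffle=True, so this is only an illustration of the idea, not how Keras implements it. The function name batch_indices and the item count of 100 are hypothetical.

```python
import random

def batch_indices(n_items, batch_size=16, seed=3):
    """Yield lists of shuffled item indices, one list per batch."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    for start in range(0, n_items, batch_size):
        yield idx[start:start + batch_size]

# Hypothetical data set of 100 reviews.
batches = list(batch_indices(100))
print("number of batches =", len(batches))
print("size of last batch =", len(batches[-1]))
```

With 100 items and a batch size of 16 there are six full batches and one final batch of four items. Each batch produces one Adam weight update.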
The demo achieved about 79% accuracy on the test data. That’s pretty weak, because limiting the data to reviews with 50 words or fewer didn’t leave enough reviews to train on. I got much better results by using reviews with 80 words or fewer.
NLP problems are, in my opinion, among the most difficult in all of machine learning. But they’re very interesting.

In the days before the Internet, movie posters were extremely important for marketing. The older a movie poster is, the more likely it is to have more detail and hints about the plot and characters. Left: “Dr. No” (1962), the first Bond movie, starring Sean Connery. Center: “GoldenEye” (1995), starring Pierce Brosnan. Right: “Casino Royale” (2006), starring Daniel Craig.
Demo code. Replace “lt”, “gt”, “lte”, “gte” with Boolean operator symbols. My lame blog editor chokes on symbols.
# imdb_lstm_tfk.py
# LSTM for sentiment analysis on the IMDB dataset
# Anaconda3-2020.02 (Python 3.7.6)
# TensorFlow 2.8.0 (includes KerasTF 2.8.0)
# Windows 10/11

# -----------------------------------------------------------

import os
# must be set before TensorFlow is imported to suppress log messages
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import numpy as np
import tensorflow as tf
from tensorflow import keras as K

# -----------------------------------------------------------

class MyLogger(K.callbacks.Callback):
  def __init__(self, n):
    super().__init__()
    self.n = n   # print loss & acc every n epochs

  def on_epoch_end(self, epoch, logs=None):
    logs = logs or {}  # avoid a mutable default argument
    if epoch % self.n == 0:
      curr_loss = logs.get('loss')
      curr_acc = logs.get('accuracy') * 100
      print("epoch = %4d | loss = %0.6f | acc = %0.2f%%" \
        % (epoch, curr_loss, curr_acc))

# -----------------------------------------------------------

def main():
  # 0. get started
  print("\nBegin Keras IMDB LSTM demo ")
  print("Using only reviews with 50 or less words ")
  np.random.seed(3)
  tf.random.set_seed(3)

  # 1. load data
  # each line: 50 word IDs (left-padded with 0) then the 0/1 label
  print("\nLoading preprocessed train and test data ")
  train_file = ".\\Data\\imdb_train_50w.txt"
  train_xy = np.loadtxt(train_file, usecols=range(0,51),
    delimiter=" ", comments="#", dtype=np.int64)
  train_x = train_xy[:,0:50]   # cols [0,50) = [0,49]
  train_y = train_xy[:,50]

  test_file = ".\\Data\\imdb_test_50w.txt"
  test_xy = np.loadtxt(test_file, usecols=range(0,51),
    delimiter=" ", comments="#", dtype=np.int64)
  test_x = test_xy[:,0:50]   # cols [0,50) = [0,49]
  test_y = test_xy[:,50]

  n_train = len(train_x)
  n_test = len(test_x)
  print("Num train = %d Num test = %d " % (n_train, n_test))

# -----------------------------------------------------------

  # 2. define model
  print("\nCreating LSTM binary classifier ")
  lrn_rate = 0.001
  opt_adam = K.optimizers.Adam(learning_rate=lrn_rate)
  embed_vec_len = 32   # values per word -- 100-500 is typical

  model = K.models.Sequential()
  model.add(K.layers.Embedding(input_dim=129892,
    output_dim=embed_vec_len))   # consider mask_zero=True
  model.add(K.layers.LSTM(units=100))   # 100 memory
  model.add(K.layers.Dropout(0.2))
  model.add(K.layers.Dense(units=1, activation='sigmoid'))
  model.compile(loss='binary_crossentropy',
    optimizer=opt_adam, metrics=['accuracy'])
  # print(model.summary())

# -----------------------------------------------------------

  # 3. train model
  bat_size = 16
  max_epochs = 30
  print("\nbatch size = " + str(bat_size))
  print("loss func = binary_crossentropy ")
  print("optimizer = Adam ")
  print("learn rate = %0.4f " % lrn_rate)
  print("max_epochs = %d " % max_epochs)

  my_logger = MyLogger(n=5)
  print("\nStarting training ")
  h = model.fit(train_x, train_y, epochs=max_epochs,
    batch_size=bat_size, shuffle=True, verbose=0,
    callbacks=[my_logger])
  print("Training complete ")

  # 4. evaluate model
  # evaluate() returns [loss, accuracy]
  eval_results = model.evaluate(test_x, test_y, verbose=0)
  print("\nAccuracy on test data = %8.2f%%" % (eval_results[1]*100))

  # 5. save model
  print("\nSaving model to disk ")
  # mp = ".\\Models\\imdb_model.h5"
  # model.save(mp)

  # 6. use model
  print("\nFor \"the movie was a great waste of my time\"")
  print("0 = negative, 1 = positive ")
  review = np.array([4, 20, 16, 6, 86, 425, 7, 58, 64],
    dtype=np.int64)   # cheating . .
  padding = np.zeros(50-len(review), dtype=np.int64)
  review = np.concatenate([padding, review])
  review = review.reshape(1, -1)   # shape (1, 50): a batch of one review
  prediction = model.predict(review)   # shape (1, 1)
  print("raw output : ", end="")
  print("%0.4f " % prediction[0][0])

# -----------------------------------------------------------

  print("\nEnd Keras IMDB LSTM sentiment demo")

if __name__ == "__main__":
  main()
