Sentiment Analysis Using a PyTorch LSTM

I’ve been poking away for many weeks at the problem of sentiment analysis using a PyTorch LSTM (long short-term memory) network. Specifically, my ultimate goal is to create a prediction model for the IMDB movie review dataset.

As it turns out, this is an extremely challenging problem, both conceptually and technically. One of the big stumbling blocks is that the IMDB dataset is large (50,000 text reviews), so loading the data into memory takes several minutes and every experiment takes a long time. Therefore, I decided to experiment with a tiny dummy dataset:

  import torch as T  # the demo refers to torch as T

  train_x = [
    "{pad} {pad} the movie was excellent".split(),
    "{pad} this film was a disaster".split(),
    "{pad} not good and not bad".split(),
    "i liked this movie a lot".split(),
    "{pad} a waste of my time".split(),
    "{pad} {pad} {pad} time well spent".split(),
    "{pad} {pad} {pad} a decent movie".split(),
    "{pad} {pad} {pad} a great movie".split(),
    "{pad} i rate this movie average".split(),
    "{pad} i rate this film bad".split()
  ]

  train_y = T.tensor([[2],[0],[1],
                      [2],[0],[2],
                      [1],[2],[1],[0]], dtype=T.long)

I use 0 for a negative review, 1 for a neutral/average review, and 2 for a positive review. After many hours of experimenting, I finally got a rudimentary system up and running. My demo is by no means useful for real data, but it represents a big milestone for me. In my demo, the prediction for a new review of “i rate this film excellent” is (0.3835, 0.2031, 0.4134) and because the largest pseudo-probability is 0.4134 at index [2], the prediction is “positive review”.
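To make the last step concrete, here is a minimal sketch of how the three pseudo-probabilities map to a class label. The LABELS list and the variable names are my own for illustration, not taken from the demo program; the numbers are the ones quoted above.

```python
import torch as T

# Label encoding from the post: 0 = negative, 1 = neutral/average, 2 = positive.
# LABELS is a hypothetical helper list, not part of the original demo.
LABELS = ["negative", "neutral/average", "positive"]

# Pseudo-probabilities quoted in the post for "i rate this film excellent".
probs = T.tensor([0.3835, 0.2031, 0.4134])

pred_idx = int(T.argmax(probs))   # index of the largest pseudo-probability
pred_label = LABELS[pred_idx]     # map the index back to a readable label
print(pred_idx, pred_label)
```

Note that the winning value is well under 0.5, so the model is only weakly confident in this prediction.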

The demo program isn’t very long — about 120 lines of code — but the code is extraordinarily complex. Even so, there are many enhancements that could be added to the demo. For example, the demo uses online training rather than batch training, uses a single-direction LSTM rather than bidirectional, and treats the special “{pad}” word as a regular word.
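The design choices just listed can be sketched roughly as follows. This is not the demo program itself: the class name SentimentNet, the vocab dictionary, the layer sizes, the learning rate, and the epoch count are all assumptions made for illustration, and only two of the ten reviews are used to keep it short.

```python
import torch as T

T.manual_seed(1)

# Two of the post's padded reviews and their labels (2 = positive, 0 = negative).
train_x = [
    "{pad} {pad} the movie was excellent".split(),
    "{pad} this film was a disaster".split(),
]
train_y = T.tensor([[2], [0]], dtype=T.long)

# Hypothetical vocabulary: one integer ID per distinct word.
# "{pad}" gets an ordinary ID, so it is treated as a regular word.
vocab = {w: i for i, w in enumerate(sorted({w for r in train_x for w in r}))}

class SentimentNet(T.nn.Module):
    def __init__(self, vocab_size, embed_dim=8, hidden_dim=16, n_classes=3):
        super().__init__()
        self.embed = T.nn.Embedding(vocab_size, embed_dim)
        self.lstm = T.nn.LSTM(embed_dim, hidden_dim)  # single direction by default
        self.fc = T.nn.Linear(hidden_dim, n_classes)

    def forward(self, ids):
        # ids: 1-D tensor of word IDs for a single review
        z = self.embed(ids).unsqueeze(1)   # (seq_len, batch=1, embed_dim)
        _, (h_n, _) = self.lstm(z)         # h_n: (1, 1, hidden_dim)
        return self.fc(h_n[-1])            # (1, n_classes) raw logits

net = SentimentNet(len(vocab))
loss_fn = T.nn.CrossEntropyLoss()          # applies log-softmax internally
opt = T.optim.SGD(net.parameters(), lr=0.1)

# Online training: update the weights after every single review.
for epoch in range(50):
    for words, y in zip(train_x, train_y):
        ids = T.tensor([vocab[w] for w in words], dtype=T.long)
        opt.zero_grad()
        loss = loss_fn(net(ids), y)        # logits (1, 3) vs. target of shape (1,)
        loss.backward()
        opt.step()

# Pseudo-probabilities for the first training review.
with T.no_grad():
    ids = T.tensor([vocab[w] for w in train_x[0]], dtype=T.long)
    probs = T.softmax(net(ids), dim=1)     # shape (1, 3), sums to 1
print(probs)
```

Switching to a bidirectional LSTM would mean passing bidirectional=True to T.nn.LSTM and widening the linear layer to take both final hidden states, since the LSTM then produces one per direction.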

But for difficult problems, it’s best to take one step at a time and this demo was a big step.



Concept art from the 1960s for the design of the Pirates of the Caribbean ride at Disneyland. By artist Marc Davis. I worked at Disneyland when I was in college, and sometimes worked on the Pirates ride. Still an amazingly complex system and one of many people’s favorite attractions more than 50 years after opening in 1967.
