Understanding LSTM Neural Networks – for Mere Mortals

One of my go-to sources for news about machine learning and AI is PureAI. See https://pureai.com. I helped out on a recent article titled “Understanding LSTM Neural Networks – for Mere Mortals”. See https://pureai.com/articles/2019/11/14/lstm-mere-mortals.aspx.

The motivation for the article is that although LSTM neural networks have become a standard tool, most explanations are at either too low a level (hundreds of lines of complex code) or too high a level (all fluff and no content that’s useful in a practical way). So the PureAI editors wanted to create an explanation of LSTM networks that’s not too detailed but not too vague. I think they (well, “we” because I helped) succeeded.

LSTM (long short-term memory) neural networks have become a standard tool for creating practical prediction systems. The article briefly explains what types of problems LSTMs can and cannot solve, describes how LSTMs work, and discusses issues related to implementing an LSTM prediction system in practice.


The article explains this diagram of how LSTMs work in a simple and only semi-technical way.
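The heart of such a diagram is the LSTM cell update: a forget gate, an input gate, and an output gate control how information flows into and out of a long-term cell state. A minimal sketch of one cell step, assuming the standard gate formulation, with random placeholder weights rather than trained values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # W: (4n, d) input weights, U: (4n, n) recurrent weights, b: (4n,) biases
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[0:n])        # forget gate: what to discard from the cell state
    i = sigmoid(z[n:2*n])      # input gate: how much new information to store
    o = sigmoid(z[2*n:3*n])    # output gate: what to expose as the hidden state
    g = np.tanh(z[3*n:4*n])    # candidate values for the cell state
    c = f * c_prev + i * g     # new long-term (cell) state
    h = o * np.tanh(c)         # new short-term (hidden) state
    return h, c

rng = np.random.default_rng(0)
d, n = 3, 4                    # input size, hidden size (arbitrary for the demo)
W = rng.normal(size=(4*n, d))
U = rng.normal(size=(4*n, n))
b = np.zeros(4*n)
h, c = lstm_step(rng.normal(size=d), np.zeros(n), np.zeros(n), W, U, b)
print(h.shape, c.shape)        # each is a length-4 vector
```

Because tanh bounds the output gate's source, the hidden-state values always stay in (-1, 1), while the cell state can grow to carry information across many time steps.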

At a high level of abstraction, LSTM networks can be used for natural language processing (NLP). A specific example where LSTMs are highly effective is sentiment analysis – predicting if a sentence or paragraph is negative (“Beyond comprehension”), neutral (“I rate it so-so”), or positive (“Unbelievable service”). Notice that all three mini-examples are somewhat ambiguous.

LSTM networks are often extremely good at NLP problems that predict a single output, such as a sentiment of negative, neutral, or positive. But LSTM networks have been less successful at predicting multiple output values. For example, you can create an LSTM network that translates from English to Spanish. The input is a sequence of English words and the output is a sequence of Spanish words. For such sequence-to-sequence problems, even though LSTM networks can be successful, a newer architecture, the attention-based Transformer, is increasingly used instead.
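The single-output (many-to-one) pattern can be sketched like this: the network reads the whole sentence word by word, and only the final hidden state feeds the classifier. This is a hypothetical illustration with a simplified tanh recurrence standing in for the full LSTM gates, and random placeholder weights, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 5, 8                          # word-vector size, hidden-state size
Wx = rng.normal(size=(n, d)) * 0.1   # input weights (placeholder, untrained)
Wh = rng.normal(size=(n, n)) * 0.1   # recurrent weights
Wy = rng.normal(size=(3, n)) * 0.1   # readout: 3 classes (neg / neutral / pos)

sentence = rng.normal(size=(6, d))   # 6 words, each a d-dim embedding
h = np.zeros(n)
for x in sentence:                   # process the sequence one word at a time
    h = np.tanh(Wx @ x + Wh @ h)     # simplified recurrent update

logits = Wy @ h                      # classify from the FINAL state only
probs = np.exp(logits) / np.exp(logits).sum()
print(probs.round(3))                # one prediction for the whole sentence
```

A sequence-to-sequence task like translation would instead need to emit an output at every decoding step, which is where attention mechanisms pay off.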



Three colorful illustrations of Egyptian goddesses by an artist named Yliade. From left, Serqet, Isis, Nuut.

This entry was posted in Machine Learning.

2 Responses to Understanding LSTM Neural Networks – for Mere Mortals

  1. Thorsten Kleppe says:

    A very interesting article, thanks for the trip.

    I was wondering if you would write another post about your book “Neural Networks with JavaScript Succinctly” — nothing more than an announcement appeared here? I am glad to have read the good book; maybe others who missed the announcement should also be so lucky.

  2. Peter Boos says:

    In general I am kind of disappointed in LSTMs; they babble nonsense most of the time.
    These days they're surpassed by other types of neural networks.
    Here is a link to one of the best ‘modern’ text-generating neural nets; give it a few words and it continues with almost-real text content: https://talktotransformer.com/

    But despite my LSTM disappointment (for which I once bought a book), the article is OK. I hope you like the link as well; you should surely see that in action.

Comments are closed.