Just for fun, while I was eating breakfast one morning, I decided to code up an LSTM cell using Python. So I did.
An LSTM cell is a complex software module that accepts input (as a vector), generates output, and maintains cell state. If you connect an LSTM cell with some additional plumbing, you get an LSTM network. These networks can be used with sequence data, such as a sequence of words in a sentence.
I used as my base reference the description given in the Wikipedia entry on the topic. There are many, many variations of LSTMs, and I used the simplest.
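For reference, the standard no-peephole LSTM cell equations from that Wikipedia entry (the variant the code below implements, where the circle denotes element-wise multiplication) are:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
c_t &= f_t \circ c_{t-1} + i_t \circ \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
h_t &= o_t \circ \tanh(c_t)
\end{aligned}
```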
It was a good exercise, and it reinforced my understanding of LSTMs as well as of the NumPy dot() function (matrix multiplication), the multiply() function (Hadamard, element-wise matrix multiplication), and element-wise addition (implemented with add() or the overloaded '+' operator).
LSTMs are very interesting. At some point I'll take a stab at hooking up a full LSTM network, and then training it, which will not be a trivial task.
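The distinction between those three NumPy operations is worth a tiny demo (the matrices here are just made-up 2x2 examples):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

mm = np.dot(A, B)       # matrix multiplication: rows of A times columns of B
hp = np.multiply(A, B)  # Hadamard product: element-by-element multiplication
s  = A + B              # element-wise addition; equivalent to np.add(A, B)

print(mm)  # [[19. 22.] [43. 50.]]
print(hp)  # [[ 5. 12.] [21. 32.]]
print(s)   # [[ 6.  8.] [10. 12.]]
```

In the LSTM code below, dot() is used wherever a weight matrix transforms a vector, and multiply() is used wherever a gate vector scales another vector component by component.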
# lstm_io.py
# forward pass of a single LSTM cell, using raw NumPy

import numpy as np

np.set_printoptions(precision=4)

def sigmoid(x):
  return 1 / (1 + np.exp(-x))

def compute_outputs(xt, h_prev, c_prev,
                    Wf, Wi, Wo, Wc,
                    Uf, Ui, Uo, Uc,
                    bf, bi, bo, bc):
  ft = sigmoid(np.dot(Wf, xt) + np.dot(Uf, h_prev) + bf)  # forget gate
  it = sigmoid(np.dot(Wi, xt) + np.dot(Ui, h_prev) + bi)  # input gate
  ot = sigmoid(np.dot(Wo, xt) + np.dot(Uo, h_prev) + bo)  # output gate
  ct = np.multiply(ft, c_prev) + \
       np.multiply(it, np.tanh(np.dot(Wc, xt) +
                               np.dot(Uc, h_prev) + bc))  # new cell state
  ht = np.multiply(ot, np.tanh(ct))                       # new output
  return (ht, ct)

# =========================================================

def main():
  print("\nBegin LSTM demo\n")

  xt = np.array([[1.0], [2.0]], dtype=np.float32)
  h_prev = np.zeros(shape=(3,1), dtype=np.float32)
  c_prev = np.zeros(shape=(3,1), dtype=np.float32)

  W = np.array([[0.01, 0.02],
                [0.03, 0.04],
                [0.05, 0.06]], dtype=np.float32)
  U = np.array([[0.07, 0.08, 0.09],
                [0.10, 0.11, 0.12],
                [0.13, 0.14, 0.15]], dtype=np.float32)
  b = np.array([[0.16], [0.17], [0.18]], dtype=np.float32)

  # for simplicity, all four gates share the same weights and biases
  Wf = np.copy(W); Wi = np.copy(W)
  Wo = np.copy(W); Wc = np.copy(W)
  Uf = np.copy(U); Ui = np.copy(U)
  Uo = np.copy(U); Uc = np.copy(U)
  bf = np.copy(b); bi = np.copy(b)
  bo = np.copy(b); bc = np.copy(b)

  print("Sending input = (1.0, 2.0) \n")
  (ht, ct) = compute_outputs(xt, h_prev, c_prev, Wf, Wi,
    Wo, Wc, Uf, Ui, Uo, Uc, bf, bi, bo, bc)
  print("output = ")
  print(ht)
  print("")
  print("new cell state = ")
  print(ct)
  print("\n")

  # feed the new output and cell state back in as the previous state
  h_prev = np.copy(ht)
  c_prev = np.copy(ct)
  xt = np.array([[3.0], [4.0]], dtype=np.float32)

  print("Sending input = (3.0, 4.0) \n")
  (ht, ct) = compute_outputs(xt, h_prev, c_prev, Wf, Wi,
    Wo, Wc, Uf, Ui, Uo, Uc, bf, bi, bo, bc)
  print("output = ")
  print(ht)
  print("")
  print("new cell state = ")
  print(ct)

  print("\nEnd \n")

if __name__ == "__main__":
  main()
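The "additional plumbing" mentioned earlier is mostly just a loop: feed each ht and ct back in as h_prev and c_prev for the next time step. A minimal sketch of that idea, using a compact variant of the cell above with hypothetical small random weights (not the demo's fixed values):

```python
import numpy as np

def sigmoid(x):
  return 1 / (1 + np.exp(-x))

def lstm_cell(xt, h_prev, c_prev, W, U, b):
  # W, U, b are dicts keyed by gate name: 'f', 'i', 'o', 'c'
  ft = sigmoid(np.dot(W['f'], xt) + np.dot(U['f'], h_prev) + b['f'])
  it = sigmoid(np.dot(W['i'], xt) + np.dot(U['i'], h_prev) + b['i'])
  ot = sigmoid(np.dot(W['o'], xt) + np.dot(U['o'], h_prev) + b['o'])
  ct = ft * c_prev + \
    it * np.tanh(np.dot(W['c'], xt) + np.dot(U['c'], h_prev) + b['c'])
  ht = ot * np.tanh(ct)
  return ht, ct

rng = np.random.default_rng(0)
W = {g: rng.normal(0, 0.1, (3, 2)) for g in 'fioc'}  # input weights
U = {g: rng.normal(0, 0.1, (3, 3)) for g in 'fioc'}  # recurrent weights
b = {g: np.zeros((3, 1)) for g in 'fioc'}            # biases

# process a three-item sequence; h and c carry state across time steps
seq = [np.array([[1.0], [2.0]]), np.array([[3.0], [4.0]]),
       np.array([[5.0], [6.0]])]
h = np.zeros((3, 1))
c = np.zeros((3, 1))
for xt in seq:
  h, c = lstm_cell(xt, h, c, W, U, b)
  print(h.T)
```

Training would then mean unrolling this loop over the sequence and back-propagating through it, which is where the non-trivial part comes in.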


Yesterday I tried to use the C# example you gave of this.
For predictions it's a memory, but how would you use this with airplane traffic data?
Does this learn an adjusting pattern over time, or is it more of a likely-state decision model?
For example, a given input state with some numeric result, categorized toward a = 64, B = 16, C = 73, and state e = 40.
I have a related question about this model and prediction models: how do you create and use loop-backs and signal delays in time-series neural networks? (Not for trading, but for personal medical reasons.)