Exploring Non-Negative Matrix Factorization From Scratch Using Python

If you have a source matrix V with shape (m,n) where all cells are non-negative, and you specify a k value, the goal of non-negative matrix factorization (NMF) is to find a matrix W with shape (m,k) and a matrix H with shape (k,n) so that W * H is close to V (but not necessarily exactly equal to V).
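To make the shapes concrete, here is a minimal sketch (the k = 2 value and the random factor values are just for illustration):

```python
import numpy as np

m, n, k = 4, 5, 2           # V is (m,n); W is (m,k); H is (k,n)
rng = np.random.RandomState(0)
W = rng.random((m, k))      # tall-and-thin factor
H = rng.random((k, n))      # short-and-wide factor
V_approx = W @ H            # the product has the same shape as V
print(V_approx.shape)       # (4, 5)
```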

I put together a demo of NMF using from-scratch Python. For my demo, I created a (4,5) source matrix V:

[[ 2.0  4.0  6.0  8.0 10.0]
 [ 4.0  8.1 12.0 16.0 19.9]
 [ 1.0  1.9  3.0  4.0  4.9]
 [ 5.0 11.0 17.1 23.0 29.0]]

Important Note: The non-negative matrix factorization technique assumes that the source matrix V has some sort of underlying structure. In my demo V, row 2 is about twice row 1, row 3 is about half of row 1, and row 4 is about three times row 1 minus 1.
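Because of that structure, the demo V can be generated programmatically from its first row (this sketch omits the small perturbations present in the actual demo values):

```python
import numpy as np

base = np.array([2.0, 4.0, 6.0, 8.0, 10.0])   # row 1
V_exact = np.vstack([base,                    # row 1
                     2.0 * base,              # row 2: twice row 1
                     0.5 * base,              # row 3: half of row 1
                     3.0 * base - 1.0])       # row 4: three times row 1, minus 1
# V_exact has rank 2, so a k=2 factorization can reproduce it almost exactly
```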

My first explorations of NMF used a V matrix with random values, and the results were terrible in the sense that W * H was not very close to the source V.

Put another way, contrary to much of the information I found on the Internet, NMF is factorization for a specific scenario; NMF is absolutely not a general-purpose matrix factorization technique.
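A quick way to see this is to run NMF on a purely random matrix of the same size and compare the reconstruction error to the 0.1161 obtained on the structured demo matrix. This sketch uses scikit-learn; the seed is arbitrary and the exact error value will vary, but it will typically be far larger:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
V_rand = rng.random((4, 5)) * 10.0   # same size as the demo V, but no structure

model = NMF(n_components=2, init='random', random_state=0, max_iter=5000)
W = model.fit_transform(V_rand)
H = model.components_
print("error = %0.4f" % model.reconstruction_err_)  # much larger than 0.1161
```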


My from-scratch implementation is based directly on the algorithm given in the nice Wikipedia article at https://en.wikipedia.org/wiki/Non-negative_matrix_factorization. There are several alternative algorithms for NMF.
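In essence, the Wikipedia algorithm is the Lee-Seung multiplicative update rule: each iteration multiplies every cell of H and then W by a ratio of matrix products, which keeps all values non-negative. A condensed sketch of one run on the demo V:

```python
import numpy as np

V = np.array([[2.0,  4.0,  6.0,  8.0, 10.0],
              [4.0,  8.1, 12.0, 16.0, 19.9],
              [1.0,  1.9,  3.0,  4.0,  4.9],
              [5.0, 11.0, 17.1, 23.0, 29.0]])

rnd = np.random.RandomState(2)
W = rnd.random((4, 2))   # random initial factors in [0,1)
H = rnd.random((2, 5))

for _ in range(10000):
    H = H * (W.T @ V) / (W.T @ W @ H)   # scale each cell of H
    W = W * (V @ H.T) / (W @ H @ H.T)   # scale each cell of W

err = np.sqrt(np.sum((V - W @ H)**2))   # small: W*H is close to V
```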

The resulting W and H matrices are:

W =
[[0.3506 0.1713]
 [0.7772 0.3054]
 [0.1614 0.0901]
 [0.2606 0.8463]]

H =
[[ 3.2376  6.0156  8.5402 11.2752 13.8126]
 [ 4.9139 11.1413 17.5771 23.7062 30.0129]]

The matrix product W * H is:

W * H =
[[ 1.9768  4.0175  6.0050  8.0137  9.9836]
 [ 4.0169  8.0777 12.0052 16.0026 19.9005]
 [ 0.9654  1.9750  2.9625  3.9563  4.9343]
 [ 5.0023 10.9965 17.1011 23.0009 28.9995]]

which is quite close to the source matrix V.

I computed a measure of error between W * H and the source V as the square root of the sum of the squared differences between cell values (the “Frobenius norm”):

error = 0.1161
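As a side note, NumPy can compute the Frobenius norm directly via np.linalg.norm, so the explicit sum-of-squares form and the built-in version are interchangeable:

```python
import numpy as np

rng = np.random.RandomState(0)
V = rng.random((4, 5))
W = rng.random((4, 2))
H = rng.random((2, 5))

diff = V - W @ H
err1 = np.sqrt(np.sum(diff**2))       # explicit sum-of-squares form
err2 = np.linalg.norm(diff, 'fro')    # built-in Frobenius norm
print(np.allclose(err1, err2))        # True
```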

Note that there are many, many possible W and H results. The main goal is to find a pair of W and H matrices so that W * H is close to the source V matrix. Note also that I'm thinking of the source matrix V as having "samples" as the rows and "features" as the columns; in most of the literature, the meanings are reversed.

To validate my from-scratch implementation, I used the NMF class from the sklearn.decomposition module of the scikit-learn library. The W and H matrices were very different, but the W * H product was the same to three decimals and the error was identical:

Start compute NMF using scikit
Done

W =
[[0.9017 1.6828]
 [1.6636 3.5865]
 [0.4659 0.7967]
 [3.9098 2.6497]]

H =
[[0.7590 1.8759 3.0707 4.1699 5.3333]
 [0.7680 1.3821 1.9230 2.5277 3.0749]]

W * H =
[[ 1.9767  4.0174  6.0050  8.0137  9.9836]
 [ 4.0169  8.0777 12.0052 16.0026 19.9004]
 [ 0.9654  1.9750  2.9625  3.9563  4.9343]
 [ 5.0023 10.9965 17.1011 23.0009 28.9995]]

error = 0.1161

Different seed values (for my from-scratch version and the scikit version) will give significantly different W and H results, with slightly different error values. Therefore, an approach that is sometimes used is to run NMF factorization with several different seed values, and return the W and H matrices that give the smallest error.
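Here is one way that restart strategy could be sketched. The nmf_once and best_nmf names are hypothetical helpers, using the same multiplicative updates as the demo code:

```python
import numpy as np

def nmf_once(V, k, max_iter, seed):
    # one NMF run from a seeded random initialization; returns (W, H, error)
    rnd = np.random.RandomState(seed)
    (m, n) = V.shape
    W = rnd.random((m, k))
    H = rnd.random((k, n))
    for _ in range(max_iter):
        H = H * (W.T @ V) / (W.T @ W @ H)
        W = W * (V @ H.T) / (W @ H @ H.T)
    err = np.sqrt(np.sum((V - W @ H)**2))
    return (W, H, err)

def best_nmf(V, k, max_iter, seeds):
    # run NMF once per seed, keep the factorization with the smallest error
    results = [nmf_once(V, k, max_iter, s) for s in seeds]
    return min(results, key=lambda t: t[2])
```

A call like (W, H, err) = best_nmf(V, 2, 10000, seeds=range(5)) would return the best of five runs.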


The algorithm in the Wikipedia article isn't as complicated as it first appears.

Non-negative matrix factorization is a surprisingly complex and interesting topic. The Wikipedia article describes applications using astronomy data, text mining, spectral data analysis, speech denoising, population genetics, and bioinformatics.



Matrix factorization is also called matrix decomposition. I’m a big fan of early 1950s science fiction movies. It wasn’t a good time to be in the military then, because aliens always seemed to have decomposition/disintegration ray weapons.

Left: In “War of the Worlds” (1953), Earth’s first contact with the Martians doesn’t go well. The Martians had a green decomposition/disintegration ray and a red heat ray. A very good movie. I give it my personal B+ grade.

Center: In “The Atomic Submarine” (1959), an alien underwater saucer is wreaking havoc until the submarine USS Tigershark rams it and some of the crew are able to enter the saucer. Unfortunately for them, the alien on board has a decomposition/disintegration ray. But the good guys win in the end. This is objectively not a good movie, but I like it a lot. Fantastic sound effects. I give it my personal B grade.

Right: In “Earth vs. the Flying Saucers” (1956), the story is pretty much what you’d expect from the title. The aliens have a powerful decomposition/disintegration weapon but Earth scientists develop a sonic weapon that defeats the invaders. I give this movie my personal B grade.


Demo code:

# nmf_demo.py

# non-negative matrix factorization
# based on:
# en.wikipedia.org/wiki/Non-negative_matrix_factorization

import numpy as np

# -----------------------------------------------------------

def update_H(W, H, V):
  N = np.matmul(W.T, V)  # numerator
  D = np.matmul(np.matmul(W.T, W), H)  # denominator
  result = np.multiply(H, np.divide(N, D))
  return result

def update_W(W, H, V):
  N = np.matmul(V, H.T)  # numerator
  D = np.matmul(np.matmul(W, H), H.T)
  result = np.multiply(W, np.divide(N, D))
  return result

def nmf_compute(V, k, max_iter):
  rnd = np.random.RandomState(2)  # could pass seed as param
  (m,n) = V.shape
  W = rnd.random((m,k))
  H = rnd.random((k,n))
  for i in range(max_iter):
    H = update_H(W, H, V)
    W = update_W(W, H, V)
    if i % (max_iter // 5) == 0:
      err = error(W, H, V)
      print("iteration = %4d  error = %0.4f " % (i, err))
  return (W, H)

def error(W, H, V):
  # "Frobenius norm" of diff between V and W*H
  err = np.sqrt(np.sum((V - np.matmul(W,H))**2))
  return err

# -----------------------------------------------------------
# here's a version that's self-contained (no helpers)
# call like (W, H, err) =  nmf_compute2(V, 2, 10000, seed=1)
# -----------------------------------------------------------

def nmf_compute2(V, k, max_iter, seed):
  rnd = np.random.RandomState(seed)
  (m,n) = V.shape
  W = rnd.random((m,k))
  H = rnd.random((k,n))
  for i in range(max_iter):
    # H = update_H(W, H, V)
    N = np.matmul(W.T, V)  # numerator
    D = np.matmul(np.matmul(W.T, W), H)  # denominator
    H = np.multiply(H, np.divide(N, D))
    # W = update_W(W, H, V)
    N = np.matmul(V, H.T)  # numerator
    D = np.matmul(np.matmul(W, H), H.T)
    W = np.multiply(W, np.divide(N, D))

    if i % (max_iter // 5) == 0:
      # err = error(W, H, V)
      err = np.sqrt(np.sum((V - np.matmul(W,H))**2))
      print("iteration = %4d  error = %0.4f " % (i, err))
  err = np.sqrt(np.sum((V - np.matmul(W,H))**2))
  return (W, H, err)

# -----------------------------------------------------------

def main():
  print("\nBegin NMF from scratch demo ")
  np.random.seed(0)
  np.set_printoptions(precision=1, suppress=True,
    floatmode='fixed')

  m = 4 
  n = 5  # V is (m,n)
  k = 2  # W = (m,k), H = (k,n), V = W*H

  V = [[2.0,  4.0,  6.0,  8.0, 10.0],
       [4.0,  8.1, 12.0, 16.0, 19.9],
       [1.0,  1.9,  3.0,  4.0,  4.9],
       [5.0, 11.0, 17.1, 23.0, 29.0]]
  V = np.array(V)
  print("\nV = ")
  print(V)

  np.set_printoptions(precision=4)
  print("\nStart compute NMF from scratch ")
  (W, H) = nmf_compute(V, k, 10000)
  # (W, H, err) = nmf_compute2(V, k, 10000, seed=1)
  print("Done ")

  print("\nW = ")
  print(W)
  print("\nH = ")
  print(H)

  print("\nW * H = ")
  WH = np.matmul(W, H)
  print(WH) 

  err = error(W, H, V)
  print("\nerror = %0.4f " % err)

  print("\nStart compute NMF using scikit ")
  from sklearn.decomposition import NMF
  model = NMF(n_components=2, max_iter=10000,
    init='random', random_state=2, tol=0.0)
  W = model.fit_transform(V)
  H = model.components_
  print("Done ")

  print("\nW = ")
  print(W)
  print("\nH = ")
  print(H)

  print("\nW * H = ")
  WH = np.matmul(W, H)
  print(WH)

  err = model.reconstruction_err_
  print("\nerror = %0.4f " % err)

  print("\nEnd demo ")

if __name__ == "__main__":
  main()