Singular Value Decomposition From Scratch Using Python – Even Scratchier

Bottom line: I recently implemented a function to compute the singular value decomposition (SVD) of a matrix, from scratch, using Python. That implementation used many built-in NumPy functions such as np.diag(). I refactored the implementation to remove all those NumPy calls, making a version of SVD that’s even more from-scratch. This implementation uses the Householder + QR method. It works but it’s not as stable as my older version that uses the Jacobi algorithm. Put another way, the code in this blog post is interesting but not practical.

Computing the singular value decomposition (SVD) of a matrix is one of the most difficult problems in numerical programming. Many researchers worked for many years to develop a solution. The standard implementation of SVD, in the LAPACK library, is over 10,000 lines of Fortran code!

I’ve plugged away at a from-scratch implementation of SVD for many years. When I got some free time recently, I was able to dedicate about 50 hours to a pretty decent implementation (estimated accuracy of about 1.0e-10 and reasonable performance on matrices with up to about 1,000 rows).
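At a high level, the Householder + QR approach works like this: compute a reduced QR decomposition of A, solve the small symmetric eigenproblem for Rt * R, take square roots of the eigenvalues to get singular values, and recover U from Q, R, and the eigenvectors. Here's a rough sketch of that pipeline using NumPy built-ins (the function name is mine; my actual from-scratch code below replaces every one of these NumPy calls with explicit loops):

```python
import numpy as np

def svd_sketch(A, tol=1.0e-12):
  # assumes a tall or square A (rows >= cols)
  Q, R = np.linalg.qr(A)                # reduced QR (Householder)
  eigvals, V = np.linalg.eigh(R.T @ R)  # symmetric eigenproblem
  idx = np.argsort(eigvals)[::-1]       # sort large to small
  eigvals, V = eigvals[idx], V[:, idx]
  s = np.sqrt(np.clip(eigvals, 0, None))  # singular values
  keep = s > tol                        # drop near-zero values
  U = Q @ (R @ V[:, keep] / s[keep])    # left singular vectors
  return U, s[keep], V[:, keep].T
```

A quick sanity check is to reconstruct A as U @ np.diag(s) @ Vt and compare against the original.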

The first part of my demo shows an example of SVD on a tall (more rows than columns) matrix:

Begin SVD demo

TALL MATRIX

A =
[[ 1  2  3  4  5]
 [ 0 -3  5 -7  9]
 [ 2  0 -2  0 -2]
 [ 4 -1  5  6  1]
 [ 3  6  8  2  2]
 [ 5 -2  4 -4  3]]

U =
[[ 0.319267  0.295047  0.358160  0.465731  0.652040]
 [ 0.625260 -0.614481  0.189087  0.174205 -0.138090]
 [-0.129989  0.032917 -0.342862 -0.160369  0.572690]
 [ 0.284056  0.494504 -0.490845  0.494308 -0.381366]
 [ 0.486033  0.495867  0.263713 -0.648653 -0.103790]
 [ 0.416300 -0.209426 -0.638701 -0.248874  0.267559]]

s =
[15.967661 12.793149  6.297366  5.706879  2.478679]

Vt =
[[ 0.296544  0.035215  0.708796 -0.130799  0.625557]
 [ 0.217254  0.416871  0.261754  0.803400 -0.255056]
 [-0.745282  0.555722 -0.030757  0.039094  0.365040]
 [-0.187161 -0.609726 -0.196995  0.579569  0.467438]
 [ 0.523820  0.379983 -0.623971  0.003540  0.438033]]

reconstructed A
[[ 1.000000  2.000000  3.000000  4.000000  5.000000]
 [ 0.000000 -3.000000  5.000000 -7.000000  9.000000]
 [ 2.000000  0.000000 -2.000000  0.000000 -2.000000]
 [ 4.000000 -1.000000  5.000000  6.000000  1.000000]
 [ 3.000000  6.000000  8.000000  2.000000  2.000000]
 [ 5.000000 -2.000000  4.000000 -4.000000  3.000000]]

====================

np.linalg.svd results:

U =
[[-0.319267  0.295047  0.358160 -0.465731  0.652040]
 [-0.625260 -0.614481  0.189087 -0.174205 -0.138090]
 [ 0.129989  0.032917 -0.342862  0.160369  0.572690]
 [-0.284056  0.494504 -0.490845 -0.494308 -0.381366]
 [-0.486033  0.495867  0.263713  0.648653 -0.103790]
 [-0.416300 -0.209426 -0.638701  0.248874  0.267559]]

s =
[15.967661 12.793149  6.297366  5.706879  2.478679]

Vt =
[[-0.296544 -0.035215 -0.708796  0.130799 -0.625557]
 [ 0.217254  0.416871  0.261754  0.803400 -0.255056]
 [-0.745282  0.555722 -0.030757  0.039094  0.365040]
 [ 0.187161  0.609726  0.196995 -0.579569 -0.467438]
 [ 0.523820  0.379983 -0.623971  0.003540  0.438033]]

I validated the demo by using the np.linalg.svd() function to make sure I got the same results. Note that the signs of the values in the columns of U and the rows of Vt are arbitrary for SVD.
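Because of that sign ambiguity, a direct element-wise comparison can fail even when both results are correct. A small helper (my own, not part of the demo code) that compares two U matrices column-by-column, allowing either sign:

```python
import numpy as np

def same_up_to_column_signs(U1, U2, tol=1.0e-8):
  # True if each column of U1 matches the corresponding
  # column of U2, up to multiplication by -1
  for j in range(U1.shape[1]):
    d_pos = np.max(np.abs(U1[:, j] - U2[:, j]))
    d_neg = np.max(np.abs(U1[:, j] + U2[:, j]))
    if min(d_pos, d_neg) > tol:
      return False
  return True

A = np.array([[1., 2], [3, 4], [5, 6]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_flipped = U * np.array([1.0, -1.0])  # negate second column
print(same_up_to_column_signs(U, U_flipped))  # True
```

To compare the rows of two Vt matrices the same way, pass their transposes.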

For the second part of the demo, I used a wide (more columns than rows) matrix:

WIDE MATRIX

A =
[[ 3.000000  1.000000  1.000000  0.000000  5.000000]
 [-1.000000  3.000000  1.000000 -2.000000  4.000000]
 [ 0.000000  2.000000  2.000000  1.000000 -3.000000]]

U =
[[-0.727149  0.195146 -0.658158]
 [-0.621912 -0.593195  0.511219]
 [ 0.290654 -0.781049 -0.552705]]

s =
[7.639218 4.023860 3.232785]

Vt =
[[-0.204149 -0.263322 -0.100502  0.200868 -0.915716]
 [ 0.292911 -0.781970 -0.487131  0.100734  0.235122]
 [-0.768902 -0.071118 -0.387390 -0.487240  0.127506]]

reconstructed A
[[ 3.000000  1.000000  1.000000 -0.000000  5.000000]
 [-1.000000  3.000000  1.000000 -2.000000  4.000000]
 [ 0.000000  2.000000  2.000000  1.000000 -3.000000]]

====================

np.linalg.svd results:

U =
[[-0.727149 -0.195146 -0.658158]
 [-0.621912  0.593195  0.511219]
 [ 0.290654  0.781049 -0.552705]]

s =
[7.639218 4.023860 3.232785]

Vt =
[[-0.204149 -0.263322 -0.100502  0.200868 -0.915716]
 [-0.292911  0.781970  0.487131 -0.100734 -0.235122]
 [-0.768902 -0.071118 -0.387390 -0.487240  0.127506]]

End demo
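The wide-matrix case is handled inside my_svd() by the standard transpose trick: if the SVD of the tall transpose is A.T = U * S * Vt, then A = Vt.T * S * U.T, so the roles of U and Vt just swap. A quick NumPy sketch of the idea (using np.linalg.svd for brevity rather than my from-scratch code):

```python
import numpy as np

A = np.array([[ 3., 1, 1,  0,  5],
              [-1., 3, 1, -2,  4],
              [ 0., 2, 2,  1, -3]])  # 3x5 wide matrix

# SVD of the tall transpose, then swap the roles of U and Vt:
# if A.T = U_t * S * Vt_t, then A = Vt_t.T * S * U_t.T
U_t, s, Vt_t = np.linalg.svd(A.T, full_matrices=False)
U, Vt = Vt_t.T, U_t.T

recon = U @ np.diag(s) @ Vt
print(np.max(np.abs(A - recon)))  # essentially zero
```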

The refactoring took me a long time. For example, the original statement:

R[k:, k:] -= 2 * np.outer(v, v @ R[k:, k:])

uses complex slicing, matrix multiplication, the np.outer() function, and scalar multiplication. It was replaced by:

tmp1 = my_row_col_to_end(R, k)  # R[k:, k:]
tmp2 = my_vec_mat_product(v, tmp1)  # v @ R[k:, k:]
tmp3 = my_outer(v, tmp2)  # np.outer(v, v @ R[k:, k:])
B = my_scalar_mult(2.0, tmp3)
my_subtract_corner(R, k, B)  # modify R in-place

Five statements that call five from-scratch functions.
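To sanity-check an expansion like that, I'd compare the loop-based version against the NumPy one-liner on random data. For example, here is a rough check of the reflector-update statement above, with the five from-scratch steps written as inline loops rather than the actual helper functions:

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.standard_normal((5, 5))
k = 2
v = rng.standard_normal(5 - k)
v /= np.linalg.norm(v)

# NumPy one-liner
R1 = R.copy()
R1[k:, k:] -= 2 * np.outer(v, v @ R1[k:, k:])

# loop-based equivalent of the five from-scratch statements
R2 = R.copy()
m, n = R2.shape
tmp = [sum(v[i] * R2[i+k][j+k] for i in range(m-k))
       for j in range(n-k)]                 # v @ R[k:, k:]
for i in range(m-k):
  for j in range(n-k):
    R2[i+k][j+k] -= 2.0 * v[i] * tmp[j]     # minus 2 * outer(v, tmp)

print(np.max(np.abs(R1 - R2)))  # essentially zero
```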

I enjoyed the effort and learned a lot about SVD.



While I was refactoring my SVD code, I faced many situations where I had to decide whether to truncate code (fewer lines) or not (better clarity). Human limb truncation is usually not optional.

When I was growing up, my father was a doctor and my mother was a nurse. So my brother and sister and I grew up in a kind of medical environment. We learned early on that physical anomalies were nothing to be afraid of. Here are two amusing examples. Left: Clever math. Right: Clever tattoo.


Demo code. Caution: Almost certainly has a few minor bugs for edge cases.

# svd_from_scratchier.py

import numpy as np
import math

# -----------------------------------------------------------
# 31 helper functions to replace NumPy calls
# -----------------------------------------------------------

def my_norm(v):
  sum = 0.0
  for i in range(len(v)):
    sum += v[i] * v[i]
  return math.sqrt(sum)

def my_norm_matrix(A):
  m,n = A.shape
  sum = 0.0
  for i in range(m):
    for j in range(n):
      sum += A[i][j] * A[i][j]
  return math.sqrt(sum)

def my_diag(A):
  m,n = A.shape
  result = np.zeros(m)
  for i in range(m):
    result[i] = A[i][i]
  return result

def my_create_diag(v):
  # matrix from diag vector
  n = len(v)
  result = np.zeros((n,n))
  for i in range(n):
    result[i][i] = v[i]
  return result

def my_eye(n):
  result = np.zeros((n,n))
  for i in range(n):
    result[i][i] = 1.0
  return result

def my_copy_matrix(A):
  m,n = A.shape
  result = np.zeros((m,n))
  for i in range(m):
    for j in range(n):
      result[i][j] = A[i][j]
  return result 

def my_copy_vector(v):
  n = len(v)
  result = np.zeros(n)
  for i in range(n):
    result[i] = v[i]
  return result 

def my_col_to_end(A, k):
  # last part of col k, from k,k to end
  m,n = A.shape
  result = np.zeros(m-k)
  for i in range(0, m-k):
    result[i] = A[i+k][k]
  return result

def my_row_col_to_end(A, k):
  # row k to end, col k to end
  m,n = A.shape
  result = np.zeros((m-k,n-k))
  for i in range(0,m-k):
    for j in range(0,n-k):
      result[i][j] = A[i+k][j+k]
  return result

def my_vec_mat_product(v, A):
  m,n = A.shape
  result = np.zeros(n)
  for j in range(n):
    for k in range(m):
      result[j] += v[k] * A[k][j]
  return result

def my_mat_vec_product(A, v):
  m,n = A.shape
  result = np.zeros(m)
  for i in range(m):
    for k in range(n):
      result[i] += A[i][k] * v[k]
  return result

def my_outer(v1, v2):
  result = np.zeros((v1.size, v2.size))
  for i in range(v1.size):
    for j in range(v2.size):
      result[i][j] = v1[i] * v2[j]
  return result

def my_scalar_mult(x, A):
  # x times each element A
  m,n = A.shape
  result = np.zeros((m,n))
  for i in range(m):
    for j in range(n):
      result[i][j] = x * A[i][j]
  return result 

def my_subtract_corner(A, k, C):
  # subtract elements in C from lower right A
  #  starting at [k][k]
  m,n = A.shape
  for i in range(0, m-k):
    for j in range(0, n-k):
      A[i+k][j+k] -= C[i][j]
  return  # in-place

def my_extract_all_rows_col_k_to_end(A, k):
  m,n = A.shape
  result = np.zeros((m,n-k))
  for i in range(m):
    for j in range(n-k):
      result[i][j] = A[i][j+k]
  return result

def my_subtract_all_rows_col_k_to_end(A, k, C):
  # subtract elements in C from right of A
  m,n = A.shape
  for i in range(0, m):
    for j in range(0, n-k):
      A[i][j+k] -= C[i][j] 

def my_extract_first_k_cols(A, k):
  m,n = A.shape
  result = np.zeros((m,k))
  for i in range(m):
    for j in range(k):
      result[i][j] = A[i][j]
  return result

def my_extract_first_k_rows(A, k):
  m,n = A.shape
  result = np.zeros((k,n))
  for i in range(k):
    for j in range(n):
      result[i][j] = A[i][j]
  return result

def my_mat_subtract(A, B):
  m,n = A.shape
  result = np.zeros((m,n))
  for i in range(m):
    for j in range(n):
      result[i][j] = A[i][j] - B[i][j]
  return result  
  
def my_mat_prod(A, B):
  a_rows = len(A)
  a_cols = len(A[0])
  b_rows = len(B)
  b_cols = len(B[0])
  result = np.zeros((a_rows, b_cols))
  for i in range(a_rows):
    for j in range(b_cols):
      for k in range(a_cols):
        result[i][j] += A[i][k] * B[k][j]
  return result

def my_mat_transpose(A):
  m,n = A.shape
  result = np.zeros((n,m))
  for i in range(m):
    for j in range(n):
      result[j][i] = A[i][j]
  return result

def my_argsort(v):
  n = len(v)
  augmented = []
  for i in range(n):
    tmp = [v[i],i]
    augmented.append(tmp)
  augmented.sort(key=lambda x: x[0])
  result = np.zeros(n, dtype=np.int64)
  for i in range(n):
    result[i] = augmented[i][1]
  return result 

def my_reversed(v):
  n = len(v)
  result = np.zeros(n, dtype=np.int64)
  for i in range(n):
    result[i] = v[n-1-i]
  return result 

def my_sort_using_idxs(v, idxs):
  n = len(v)
  result = np.zeros(n)
  for i in range(n):
    result[i] = v[idxs[i]]
  return result

def my_sort_cols_using_idxs(A, idxs):
  m,n = A.shape
  result = np.zeros((m,n))
  for i in range(m):
    for j in range(n):
      src_col = idxs[j]
      result[i][j] = A[i][src_col]
  return result
    
def my_clip_to_nonnegative(v):
  n = len(v)
  result = np.zeros(n)
  for i in range(n):
    if v[i] < 0.0:
      result[i] = 0.0
    else:
      result[i] = v[i]
  return result

def my_sqrt(v):
  n = len(v)
  result = np.zeros(n)
  for i in range(n):
    result[i] = math.sqrt(v[i])
  return result

def my_indices_where_nonzero(v, tol):
  # 1 == not zero
  n = len(v)
  result = np.zeros(n, dtype=np.int64)
  for i in range(n):
    if v[i] > tol:
      result[i] = 1  # non-zero
  return result

def my_remove_zeros(v, indices):
  # indices: 1 is non-zero, 0 is a near-zero
  n = len(v)
  lst = []
  for i in range(n):
    if indices[i] == 1:
      lst.append(v[i])
  result = np.array(lst)
  return result

def my_remove_cols(A, indices):
  m,n = A.shape
  ct_nonzeros = 0
  for i in range(len(indices)):
    if indices[i] == 1:
      ct_nonzeros += 1
  result = np.zeros((m, ct_nonzeros))

  k = 0  # destination col
  for j in range(n):
    if indices[j] == 0: continue # next src col
    for i in range(m):
      result[i][k] = A[i][j]
    k += 1 
  return result

def my_mat_div_vector(A, v):
  # A / v
  m,n = A.shape
  if n != len(v): print("FATAL in my_mat_div_vector")
  result = np.zeros((m,n))
  for i in range(m):
    for j in range(n):
      result[i][j] = A[i][j] / v[j]
  return result  

# -----------------------------------------------------------
# helpers for my_svd(): my_qr_decomp(), my_eigen()
# -----------------------------------------------------------

def my_qr_decomp(A):
  # reduced QR decomposition using Householder reflections
  # A: (m x n), m >= n
  # returns Q: (m x n), R: (n x n)
  # used by both my_eigen() and my_svd()

  # A = A.astype(float).copy() # assume A is float
  m, n = A.shape

  # Q = np.eye(m)
  Q = my_eye(m)
  # R = A.copy()
  R = my_copy_matrix(A)

  for k in range(n):
    # x = R[k:, k]
    x = my_col_to_end(R, k)
    # normx = np.linalg.norm(x)
    normx = my_norm(x)
    if normx == 0:
      continue

    # v = x.copy()
    v = my_copy_vector(x)
    sign = 1.0 if x[0] >= 0 else -1.0
    v[0] += sign * normx
    # v /= np.linalg.norm(v)
    v /= my_norm(v)

    # Apply reflector to R
    # R[k:, k:] -= 2 * np.outer(v, v @ R[k:, k:])
    tmp1 = my_row_col_to_end(R, k)  # R[k:, k:]
    tmp2 = my_vec_mat_product(v, tmp1)
    tmp3 = my_outer(v, tmp2)
    B = my_scalar_mult(2.0, tmp3)
    my_subtract_corner(R, k, B)

    # Accumulate Q
    # Q[:, k:] -= 2 * np.outer(Q[:, k:] @ v, v)
    tmp4 = my_extract_all_rows_col_k_to_end(Q, k)
    tmp5 = my_mat_vec_product(tmp4, v)
    tmp6 = my_outer(tmp5, v)
    C = my_scalar_mult(2.0, tmp6)
    my_subtract_all_rows_col_k_to_end(Q, k, C)

  # return Q[:, :n], R[:n, :]
  Q_part = my_extract_first_k_cols(Q, n)
  R_part = my_extract_first_k_rows(R, n)
  return Q_part, R_part

# -----------------------------------------------------------

def my_eigen(A, tol=1.0e-12, max_iter=2000):
  # eigen-decomp of symmetric matrix using QR iteration
  # returns eigenvalues and eigenvectors.

  # A = A.astype(float).copy()  # assume float
  AA = my_copy_matrix(A)
  n = AA.shape[0]

  # V = np.eye(n)
  V = my_eye(n)  # eigenvectors (columns)

  for _ in range(max_iter):
    # QR decomposition (using our own QR)
    Q, R = my_qr_decomp(AA)
    # A_new = R @ Q
    A_new = my_mat_prod(R, Q)
    # V = V @ Q
    V = my_mat_prod(V, Q)

    # Check convergence (off-diagonal norm)
    # off = A_new - np.diag(np.diag(A_new))
    off = \
      my_mat_subtract(A_new, my_create_diag(my_diag(A_new)))
    # if np.linalg.norm(off) < tol:
    if my_norm_matrix(off) < tol:
      AA = A_new
      break

    AA = A_new

  # eigvals = np.diag(AA)
  eigvals = my_diag(AA)
  return eigvals, V

# -----------------------------------------------------------

def my_svd(A, tol=1.0e-12):
  # SVD using only from-scratch helper functions

  # A = np.asarray(A, float)  # assume A is float
  m, n = A.shape

  if m >= n:
    Q, R = my_qr_decomp(A)

    # symmetric eigenproblem
    # RtR = R.T @ R
    RtR = my_mat_prod(my_mat_transpose(R), R)
    eigvals, V = my_eigen(RtR, tol=tol)

    # sort large to small
    # idx = np.argsort(eigvals)[::-1]
    sort_idxs = my_reversed(my_argsort(eigvals))
    # eigvals = eigvals[idx]
    eigvals = my_sort_using_idxs(eigvals, sort_idxs)

    # V = V[:, idx]
    V = my_sort_cols_using_idxs(V, sort_idxs)

    # S = np.sqrt(np.clip(eigvals, 0, None))
    s = my_sqrt(my_clip_to_nonnegative(eigvals))

    # nonzero = S > tol
    # U = Q @ (R @ V[:, nonzero] / S[nonzero])

    idxs = my_indices_where_nonzero(s, tol)
    s_nonzero = my_remove_zeros(s, idxs)
    V_nonzero = my_remove_cols(V, idxs)

    # U = Q @ ((R @ V_nonzero) / S_nonzero)
    # tmp1 = R @ V_nonzero
    tmp1 = my_mat_prod(R, V_nonzero)
    tmp2 = my_mat_div_vector(tmp1, s_nonzero)
    # U = Q @ tmp2
    U = my_mat_prod(Q, tmp2)

    # U = Q @ (R @ (V_nonzero / S_nonzero))

    # return U, S[nonzero], V[:, nonzero].T
    return U, s_nonzero, my_mat_transpose(V_nonzero)

  else:
    # Wide matrix: transpose trick
    # U, S, Vt = svd_qr_eig_numpy_only(A.T, tol)
    # return Vt.T, S, U.T
    U, s, Vt = my_svd(my_mat_transpose(A), tol)
    return my_mat_transpose(Vt), s, my_mat_transpose(U)
   

# ===========================================================

print("\nBegin SVD demo ")

print("\nTALL MATRIX ")

np.set_printoptions(precision=6,
  suppress=True, floatmode='fixed')

A = np.array([[1, 2, 3, 4, 5],
              [0,-3, 5, -7, 9],
              [2, 0, -2, 0, -2],
              [4, -1, 5, 6, 1],
              [3, 6, 8, 2, 2],
              [5, -2, 4, -4, 3]])

print("\nA = ");print(A)

U, s, Vt = my_svd(A)
print("\nU = "); print(U)
print("\ns = "); print(s)
print("\nVt = "); print(Vt)

A_reconstructed = my_mat_prod(U, \
  my_mat_prod(my_create_diag(s), Vt))
print("\nreconstructed A"); print(A_reconstructed)

print("\n====================")

print("\nnp.linalg.svd results: ")
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print("\nU = "); print(U)
print("\ns = "); print(s)
print("\nVt = "); print(Vt)

# ===========================================================

print("\nWIDE MATRIX ")

A = np.array([[3, 1, 1, 0, 5],
              [-1, 3, 1, -2, 4],
              [0, 2, 2, 1, -3]], dtype=np.float64)

print("\nA = ");print(A)

U, s, Vt = my_svd(A)
print("\nU = "); print(U)
print("\ns = "); print(s)
print("\nVt = "); print(Vt)

A_reconstructed = my_mat_prod(U, \
  my_mat_prod(my_create_diag(s), Vt))
print("\nreconstructed A"); print(A_reconstructed)

print("\n====================")

print("\nnp.linalg.svd results: ")
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print("\nU = "); print(U)
print("\ns = "); print(s)
print("\nVt = "); print(Vt)

print("\nEnd demo ")