Computing the Coefficients of the Characteristic Polynomial of a Matrix using Polynomial Expansion of the Eigenvalues with Python

Suppose you have some square n-by-n matrix A, and suppose you want to calculate the coefficients of the characteristic polynomial of A. For example, if A =

 1.0  2.0  6.0
 3.0  4.0  5.0
 0.0  3.0  7.0

then the coefficients of the characteristic polynomial of A are (1.0, -12.0, 18.0, -25.0). The coefficients can be used for various purposes, such as computing the matrix inverse of A using the Cayley-Hamilton technique.
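As a quick illustration of the Cayley-Hamilton idea, here's a sketch (the helper name cayley_hamilton_inverse is mine, not a library function). Because a matrix satisfies its own characteristic polynomial, the inverse can be assembled from powers of A and the coefficients:

```python
import numpy as np

def cayley_hamilton_inverse(A, coeffs):
  # Cayley-Hamilton: A^n + c1*A^(n-1) + ... + cn*I = 0, so
  # A^(-1) = -(1/cn) * (A^(n-1) + c1*A^(n-2) + ... + c(n-1)*I)
  n = len(A)
  acc = np.zeros((n, n))
  P = np.eye(n)  # running power of A, starting at A^0
  for c in coeffs[:-1][::-1]:  # c(n-1), ..., c1, c0 = 1
    acc += c * P
    P = A @ P
  return -acc / coeffs[-1]

A = np.array([[1.0, 2.0, 6.0],
              [3.0, 4.0, 5.0],
              [0.0, 3.0, 7.0]])
coeffs = np.array([1.0, -12.0, 18.0, -25.0])
Ainv = cayley_hamilton_inverse(A, coeffs)
# A @ Ainv is the 3x3 identity (up to floating point error)
```

For the example matrix this works out to A^(-1) = (1/25) * (A^2 - 12A + 18I).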

But how do you compute the coefficients of the characteristic polynomial of a matrix? If you search the Internet, you'll see that by far the most commonly described technique is the Faddeev–LeVerrier algorithm. However, Faddeev–LeVerrier is numerically unstable: it repeatedly forms matrix products A*A*A*...*A, so the intermediate values can explode even for small matrices with all-positive entries.

Another way to compute the coefficients is called polynomial expansion of the eigenvalues. Suppose that for some source matrix A, you compute the eigenvalues of A (typically via the QR algorithm) and they are (2.0, -3.0, 4.0, 5.0). If you set p(x) = (x-2)*(x+3)*(x-4)*(x-5) and multiply out the linear factors, you get:

p(x) = (x-2)*(x+3)*(x-4)*(x-5)
     = x^4 - 8x^3 + 5x^2 + 74x - 120

This gives you the coefficients of the characteristic polynomial of A: (1, -8, 5, 74, -120). Remarkable.
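One way to sanity-check such an expansion with NumPy: multiplying two polynomials is the same as convolving their coefficient vectors, so you can fold in one (x - r) factor at a time with np.convolve. (The built-in np.poly() does exactly this when handed a 1-D array of roots.) A minimal sketch:

```python
import numpy as np

roots = [2.0, -3.0, 4.0, 5.0]
coeffs = np.array([1.0])  # the polynomial "1"
for r in roots:
  # multiply the running polynomial by (x - r)
  coeffs = np.convolve(coeffs, [1.0, -r])
print(coeffs)          # [1. -8. 5. 74. -120.]
print(np.poly(roots))  # same coefficients
```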

That’s fine but I did a thorough search of the Internet and found no examples of how to actually perform the polynomial expansion operation. So I set out to do so. The result is a function that accepts the eigenvalues of a matrix, and uses them to compute the coefficients of the characteristic polynomial of the matrix:

def my_poly_from_eigenvals(eigen_vals):
  # polynomial expansion algorithm
  # eigenvalues can be complex even if matrix is real
  # assumes n >= 2
  n = len(eigen_vals)
  old = np.zeros(n+1, dtype=complex)
  new = np.zeros(n+1, dtype=complex)

  # hard-init first three cells: (x - e0) * (x - e1)
  old[0] = 1.0
  old[1] = -1 * (eigen_vals[0] + eigen_vals[1])
  old[2] = eigen_vals[0] * eigen_vals[1]

  for i in range(n-2):  # fold in the remaining eigenvalues
    curr_eigen = eigen_vals[i+2]
    new[0] = 1.0
    for j in range(1, i+3):
      new[j] = old[j] + (-1 * old[j-1] * curr_eigen)
    new[i+3] = old[i+2] * (-1 * curr_eigen)

    for k in range(n+1):
      old[k] = new[k]  # old = new
  return old.real  # old == new after loop; also correct when n == 2

The code is short but quite tricky (it took me over four hours of thinking, experimenting, and coding to implement). This function can then be used to create a function that accepts a matrix and computes the coefficients of the characteristic polynomial:

def my_poly_expansion(A):
  # polynomial expansion algorithm
  (evals, _) = np.linalg.eig(A)
  return my_poly_from_eigenvals(evals)

Here’s the output of a demo where I validate my function against the built-in numpy.poly() function:

Begin coeffs of characteristic polynomial using
  polynomial expansion of eigenvalues

Setting eigenvalues = 
[ 2.0 -3.0  4.0  5.0]

Coefficients of characteristic polynomial 
  via my_poly_from_eigenvals():
[   1.0   -8.0    5.0   74.0 -120.0]

**********

Source matrix:
[[1.0 2.0 6.0]
 [3.0 4.0 5.0]
 [0.0 3.0 7.0]]

Coefficients via my_poly_expansion()
[  1.0 -12.0  18.0 -25.0]

Coefficients via my_poly_faddeev()
[  1.0 -12.0  18.0 -25.0]

Coefficients via built-in np.poly()
[  1.0 -12.0  18.0 -25.0]

End demo

A super interesting exploration, and a minor, but nice, new contribution (I think) to the Internet resources on computing the characteristic polynomial of a square matrix.



I get great pleasure from designing and implementing new algorithms. All the artists I know say they get great pleasure from creating art. The key idea here is that creating something — anything — is good.

An artist who calls himself “Batjorge” creates wonderful fractal images of simulated alien life using the Mandelbulber tool. According to the Internet, Batjorge lives in Tenerife, Canary Islands, Spain. I visited Tenerife several times when I worked as an assistant cruise director for the Royal Viking line in the 1980s. Tenerife is beautiful but isolated.


Demo code:

# characteristic.py
# coefficients of characteristic polynomial of a matrix via 
# polynomial expansion of eigenvalues

import numpy as np

np.set_printoptions(precision=1, suppress=True,
  floatmode='fixed')

# -----------------------------------------------------------

def my_poly_from_eigenvals(eigen_vals):
  # polynomial expansion algorithm
  # eigenvalues can be complex even if matrix is real
  # assumes n >= 2
  n = len(eigen_vals)
  old = np.zeros(n+1, dtype=complex)
  new = np.zeros(n+1, dtype=complex)

  # hard-init first three cells: (x - e0) * (x - e1)
  old[0] = 1.0
  old[1] = -1 * (eigen_vals[0] + eigen_vals[1])
  old[2] = eigen_vals[0] * eigen_vals[1]

  for i in range(n-2):  # fold in the remaining eigenvalues
    curr_eigen = eigen_vals[i+2]
    new[0] = 1.0
    for j in range(1, i+3):
      new[j] = old[j] + (-1 * old[j-1] * curr_eigen)
    new[i+3] = old[i+2] * (-1 * curr_eigen)

    for k in range(n+1):
      old[k] = new[k]  # old = new
  return old.real  # old == new after loop; also correct when n == 2

# -----------------------------------------------------------

def my_poly_expansion(A):
  # polynomial expansion algorithm
  (evals, _) = np.linalg.eig(A)
  return my_poly_from_eigenvals(evals)

# -----------------------------------------------------------

def my_poly_faddeev(A):
  # coeffs of characteristic polynomial of A
  # Faddeev–LeVerrier algorithm
  n = len(A)
  c = np.zeros(n+1, dtype=np.float64)  # result
  c[0] = 1.0
  Ak = np.copy(A)
  for k in range(1, n+1):
    ck = -1 * Ak.trace() / k
    c[k] = ck
    for i in range(n):
      Ak[i][i] += ck
    Ak = np.matmul(A, Ak)  # unstable!
  return c

# -----------------------------------------------------------

print("\nBegin coeffs of characteristic polynomial " + \
  "using polynomial expansion of eigenvalues ")

# ex: (x-2)(x+3)(x-4)(x-5)
eigen_vals = np.array([2, -3, 4, 5], dtype=np.float64)
print("\nSetting eigenvalues: ")
print(eigen_vals)
coeffs = my_poly_from_eigenvals(eigen_vals)
print("\nCoefficients of characteristic polynomial " + \
  "via my_poly_from_eigenvals(): ")
print(coeffs)

print("\n********** ")

A = np.array([[1.0, 2.0, 6.0],
              [3.0, 4.0, 5.0],
              [0.0, 3.0, 7.0]], dtype=np.float64)

print("\nSource matrix: ")
print(A)

coeffs = my_poly_expansion(A)
print("\nCoefficients via my_poly_expansion() ")
print(coeffs)

coeffs = my_poly_faddeev(A)
print("\nCoefficients via my_poly_faddeev() ")
print(coeffs)

coeffs = np.poly(A)
print("\nCoefficients via built-in np.poly() ")
print(coeffs)

print("\nEnd demo ")