Summary:

  • Eigenvalues and Eigenvectors
    • Eigenvectors are the vectors that do not change their orientation when multiplied by the transformation matrix; they are only scaled by a factor equal to the corresponding eigenvalue.
  • Diagonalization & Eigendecomposition
    • A few applications of eigenvalues and eigenvectors that are very useful when handling data in matrix form, because you can decompose a matrix into factors that are easy to manipulate.
  • Underlying assumption behind the diagonalization and eigendecomposition
    • Make sure that the matrix you are trying to decompose is a square matrix and has linearly independent eigenvectors (distinct eigenvalues guarantee this, though it is not strictly necessary).

Eigenvalue Problem and the Characteristic Polynomial

A non-zero vector $\mathbf{v}$ of dim. $n$ is an eigenvector of a square $n \times n$ matrix $A$

  • if it satisfies a linear equation of the form $A\mathbf{v} = \lambda\mathbf{v}$, for some scalar $\lambda$ which we are solving for
    • This is called the eigenvalue equation/problem
    • Geometric intuition: The eigenvector(s) of $A$ are the vector(s) $\mathbf{v}$ which only elongate/shrink, and never leave their span(s).
      • The amount of this elongation/shrinkage is $\lambda$, a scalar value
  • Rearranging the eigenvalue problem: $A\mathbf{v} - \lambda\mathbf{v} = \mathbf{0} \implies (A - \lambda I)\mathbf{v} = \mathbf{0}$

  • The only way for $(A - \lambda I)\mathbf{v} = \mathbf{0}$ to be possible (given non-zero $\mathbf{v}$) is if $\det(A - \lambda I) = 0$
    • i.e. The matrix $(A - \lambda I)$ represents a linear transformation of the vector space which “reduces” its dimensionality (at least 1 dim is lost)
    • A matrix cannot squish non-zero vectors into the zero vector, except when its determinant is 0
  • By computing the determinant, we get the eigenvalues (one for each dimension of the square matrix, counted with multiplicity).
    • Computing $\det(A - \lambda I) = 0$ requires solving a characteristic polynomial whose roots are the $\lambda$(s)
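
For instance, a minimal sketch with NumPy, using the 2×2 matrix that appears commented out in the code below (`np.poly` returns the characteristic polynomial's coefficients for a square matrix):

import numpy as np

# Characteristic polynomial of a small upper-triangular example
B = np.array([[3, 1],
              [0, 2]])
coeffs = np.poly(B)  # monic coefficients, highest power first
print("characteristic polynomial coefficients:", coeffs)  # [ 1. -5.  6.], i.e. λ² - 5λ + 6
print("its roots are the eigenvalues:", np.roots(coeffs))  # [3. 2.]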

Why the observation matters

  • The observation that $\det(A - \lambda I) = 0$ is only useful because solving it yields the eigenvalues $\lambda$s.
    • Those help us solve for the eigenvectors $\mathbf{v}$s (i.e. the vectors that this diagonally altered matrix $(A - \lambda I)$ “shrinks” to $\mathbf{0}$), as sketched below
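
A minimal sketch of that last step, assuming the same hypothetical 2×2 example $B$ and its eigenvalue $\lambda = 2$ (`scipy.linalg.null_space` handles the null-space computation):

import numpy as np
from scipy.linalg import null_space

# (B - λI) squishes the eigenvectors for λ to 0, so they span its null space
B = np.array([[3, 1],
              [0, 2]])
lam = 2.0
v = null_space(B - lam * np.eye(2))  # orthonormal basis of the null space of (B - λI)
print(v)                             # one column, ±[0.707, -0.707]
print(B @ v - lam * v)               # ≈ 0, confirming Bv = λv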
import numpy as np

# Alternative test matrices to try:
# A = np.random.randn(4, 4)
# A = np.array([[3, 1],
#               [0, 2]])
# A = np.diag((1, 2, 3))

A = np.array([[4, 2, 2], [2, 4, 2], [2, 2, 4]])  # symmetric, so real eigenvalues
detA = np.linalg.det(A)
 
print("A:\n", A)
print("determinant(A):", detA)
 
eig_vals, eig_vecs = np.linalg.eig(A)
# eig_vals, eig_vecs = np.linalg.eigh(A)    # for symmetric/Hermitian matrices
# eig_vals, eig_vecs = scipy.linalg.eig(A)  # SciPy equivalent (needs import scipy.linalg)
 
print("\nEigenvalues - shape:", eig_vals.shape, "values:", np.round(eig_vals, 0))
print(
    "\nEigenvectors - shape:", eig_vecs.shape, "values:\n", np.round(eig_vecs.T, 2)
)
# NumPy returns unit-norm eigenvectors as the *columns* of eig_vecs (transposed above
# just for display). Other tools scale them differently, and for a repeated eigenvalue
# any basis of its eigenspace is valid, so results can look different to Wolfram and others.
 
# Link 1: https://www.wolframalpha.com/input?i=eigenvectors+of+%7B%7B4%2C2%2C2%7D%2C%7B2%2C4%2C2%7D%2C%7B2%2C2%2C4%7D%7D
# Link 2: https://matrixcalc.org/vectors.html#eigenvectors%28%7B%7B4,2,2%7D,%7B2,4,2%7D,%7B2,2,4%7D%7D%29
 
# But let's test it against the eigenvalue problem: Av = λv
print("Check against the Eigenvalue Problem Av = λv")
print("\nAv:\n", A @ eig_vecs)
print("\nλv:\n", eig_vals * eig_vecs)  # broadcasting scales column j by eig_vals[j]
 
A:
 [[4 2 2]
 [2 4 2]
 [2 2 4]]
determinant(A): 32.0
 
Eigenvalues - shape: (3,) values: [2. 8. 2.]
 
Eigenvectors - shape: (3, 3) values:
 [[-0.82  0.41  0.41]
 [ 0.58  0.58  0.58]
 [ 0.51 -0.81  0.3 ]]
Check against the Eigenvalue Problem Av = λv
 
Av:
 [[-1.63299316  4.61880215  1.01339709]
 [ 0.81649658  4.61880215 -1.61564839]
 [ 0.81649658  4.61880215  0.6022513 ]]
 
λv:
 [[-1.63299316  4.61880215  1.01339709]
 [ 0.81649658  4.61880215 -1.61564839]
 [ 0.81649658  4.61880215  0.6022513 ]]

Eigenbasis, Diagonalisation, and Eigendecomposition

Eigenbasis

  • If our basis vectors ($\hat{i}, \hat{j}, \dots$) are themselves eigenvectors of the transformation, they form what is called an eigenbasis.
  • Then if we inspect $A$, the transformation matrix, it will have the form known as a Diagonal Matrix:

$$\Lambda = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$$

  • Its form is diagonal because, recall, $A$ only scales (stretches/shrinks) its eigenvectors, which in this case are the basis vectors
    • It is very easy to compute large powers of diagonal matrices (they simply scale vectors by the eigenvalues), as the sketch below shows
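
A quick check of that claim, using a hypothetical diagonal matrix D:

import numpy as np

# Powers of a diagonal matrix only require raising each diagonal entry to that power
D = np.diag((1.0, 2.0, 3.0))
k = 5
print(np.linalg.matrix_power(D, k))  # k-fold matrix product
print(np.diag(np.diag(D) ** k))      # identical result from one elementwise operation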

Diagonalisation: Using the Eigenbasis to Diagonalise any non-diagonal $A$

  1. Find the eigenvectors of $A$
  2. Make a change of basis matrix $S$ whose columns are the eigenvectors of $A$. We’ll use this to switch the coordinate system of $A$
  3. Diagonalise $A$ to get $\Lambda$ by doing this: $\Lambda = S^{-1} A S$
  4. The new matrix $\Lambda$ is guaranteed to be diagonal, with the eigenvalues of $A$ on its main diagonal

Derivation: Why is diagonalisation possible in the first place?

Show that $AS = S\Lambda$:

  • Suppose we have $n$ linearly independent eigenvectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ of $A$;
    • then $S = \begin{pmatrix} \mathbf{v}_1 & \cdots & \mathbf{v}_n \end{pmatrix}$ is an $n \times n$ matrix, where each column is an eigenvector of $A$,
    • so $AS = \begin{pmatrix} A\mathbf{v}_1 & \cdots & A\mathbf{v}_n \end{pmatrix} = \begin{pmatrix} \lambda_1 \mathbf{v}_1 & \cdots & \lambda_n \mathbf{v}_n \end{pmatrix}$ (final step because recall $A\mathbf{v}_i = \lambda_i \mathbf{v}_i$)
  • And so $AS = \begin{pmatrix} \lambda_1 \mathbf{v}_1 & \cdots & \lambda_n \mathbf{v}_n \end{pmatrix} = S\Lambda$, which is $\Lambda = S^{-1} A S$ after multiplying both sides by $S^{-1}$ from the left

Assumptions:

The matrix you are trying to decompose must:

  • be a square matrix
  • have linearly independent eigenvectors, one for each row of the matrix (distinct eigenvalues guarantee this, though repeated eigenvalues can still work, as in the example below)
print("Recall A:\n", A)
print("\nAnd as calculated earlier for A:")
print("\nEigenvalues:", np.round(eig_vals, 0))
print("\nEigenvectors:\n", np.round(eig_vecs.T, 2))
print("\n------\n")
 
# Define a change of basis matrix S, and then diagonalise A using Lambda = S^(-1) A S
S = eig_vecs
Lambda = np.linalg.inv(S) @ A @ S
 
print("Lambda = S^-1 @ A @ S:\n", np.round(Lambda, 2))
 
Recall A:
 [[4 2 2]
 [2 4 2]
 [2 2 4]]
 
And as calculated earlier for A:
 
Eigenvalues: [2. 8. 2.]
 
Eigenvectors:
 [[-0.82  0.41  0.41]
 [ 0.58  0.58  0.58]
 [ 0.51 -0.81  0.3 ]]
 
------
 
Lambda = S^-1 @ A @ S:
 [[ 2.  0. -0.]
 [ 0.  8. -0.]
 [-0. -0.  2.]]

Now that we know $AS = S\Lambda$, we can do:

Diagonalisation: $\Lambda = S^{-1} A S$

  • Takes matrix $A$ to produce $\Lambda$, a diagonal matrix with the eigenvalues on the main diagonal
  • Multiply both sides of $AS = S\Lambda$ by $S^{-1}$ from the left.

Eigendecomposition: $A = S \Lambda S^{-1}$

  • After diagonalising to get $\Lambda$, we can use eigendecomposition to do quick matrix multiplications of $A$ (e.g. $A^n = S \Lambda^n S^{-1}$)
  • Multiply both sides of $AS = S\Lambda$ by $S^{-1}$ from the right

# Perform eigendecomposition to recover A from its eigenvectors and eigenvalues
A_eig_decomp = S @ Lambda @ np.linalg.inv(S)
print("A_eigendecomposed = S @ Lambda @ S^-1:\n", A_eig_decomp)
 
A_eigendecomposed = S @ Lambda @ S^-1:
 [[4. 2. 2.]
 [2. 4. 2.]
 [2. 2. 4.]]
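
This is what makes eigendecomposition practical for matrix powers; a sketch reusing S and Lambda from the cell above (n = 5 is an arbitrary choice):

# Fast matrix powers: A^n = S @ Lambda^n @ S^-1, where Lambda^n is just elementwise
n = 5
A_pow = S @ np.diag(np.diag(Lambda) ** n) @ np.linalg.inv(S)
print(np.allclose(A_pow, np.linalg.matrix_power(A, n)))  # expect True (up to float error)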

Questions

  • When exactly do we use decompositions?
  • What is the intuition behind an eigenvalue and eigenvector?
  • Some interesting edge cases (what are the eigenvalues and eigenvectors for):
    • A rotation-only matrix like $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ (a 90° rotation) has only imaginary $\lambda$ ($\pm i$). No real eigenvectors, as each vector is rotated off its span
    • A shear matrix like $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$: only 1 eigenvalue ($\lambda = 1$), so only 1 eigenvector direction (all vectors on the $x$-axis are eigenvectors)
    • A scaling-only matrix like $\begin{pmatrix} c & 0 \\ 0 & c \end{pmatrix}$: only 1 eigenvalue ($\lambda = c$), but all vectors in the plane are eigenvectors, each scaled by a factor of $c$
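
These claims are easy to verify numerically; a sketch assuming the matrices above (with $c = 2$ for the scaling case):

import numpy as np

R = np.array([[0, -1], [1, 0]])  # 90° rotation
print(np.linalg.eig(R)[0])       # [0.+1.j 0.-1.j]: purely imaginary eigenvalues

Sh = np.array([[1, 1], [0, 1]])  # shear
vals, vecs = np.linalg.eig(Sh)
print(vals)                      # [1. 1.]: one repeated eigenvalue
print(vecs)                      # columns are numerically parallel: one eigenvector direction

C = np.array([[2, 0], [0, 2]])   # uniform scaling by c = 2
print(np.linalg.eig(C)[0])       # [2. 2.]; here every vector in the plane is an eigenvector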