Summary:

  • Eigenvalues and Eigenvectors
    • Eigenvectors are the vectors that do not change their orientation when multiplied by the transformation matrix; they are only scaled by a factor equal to the corresponding eigenvalue.
  • Diagonalization & Eigendecomposition
    • A few applications of eigenvalues and eigenvectors that are very useful when handling data in matrix form, because you can decompose a matrix into factors that are easy to manipulate.
  • Underlying assumption behind the diagonalization and eigendecomposition
    • Make sure that the matrix you are trying to decompose is a square matrix and has linearly independent eigenvectors (distinct eigenvalues guarantee this, though it is not strictly necessary).

Eigenvalue Problem and the Characteristic Polynomial

A non-zero vector $\mathbf{v}$ of dim. $n$ is an eigenvector of a square $n \times n$ matrix $A$

  • if it satisfies a linear equation of the form $A\mathbf{v} = \lambda\mathbf{v}$, for some scalar $\lambda$ which we are solving for
    • This is called the eigenvalue equation/problem
    • Geometric intuition: The eigenvector(s) of $A$ are the vector(s) $\mathbf{v}$ which only elongate/shrink, and never leave their span(s).
      • The amount of this elongation/shrinkage is $\lambda$, a scalar value
  • Rearranging the eigenvalue problem: $A\mathbf{v} - \lambda\mathbf{v} = \mathbf{0} \implies (A - \lambda I)\mathbf{v} = \mathbf{0}$

  • The only way for $(A - \lambda I)\mathbf{v} = \mathbf{0}$ to be possible (given non-zero $\mathbf{v}$) is if $\det(A - \lambda I) = 0$
    • i.e. The matrix $(A - \lambda I)$ represents a linear transformation of the vector space which “reduces” its dimensionality (at least 1 dim is lost)
    • A matrix cannot squish non-zero vectors into the zero vector, except when its determinant is 0
  • By computing the determinant, we get the eigenvalues (one for each dimension of the square matrix, counted with multiplicity).
    • Computing $\det(A - \lambda I) = 0$ requires solving a characteristic polynomial whose roots are the $\lambda$(s)
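
For instance, a minimal sketch with NumPy, using the 2×2 matrix that appears commented out in the code below (`np.poly` returns the characteristic polynomial's coefficients for a square matrix):

import numpy as np

# Characteristic polynomial of a small upper-triangular example
B = np.array([[3, 1],
              [0, 2]])
coeffs = np.poly(B)  # monic coefficients, highest power first
print("characteristic polynomial coefficients:", coeffs)  # [ 1. -5.  6.], i.e. λ² - 5λ + 6
print("its roots are the eigenvalues:", np.roots(coeffs))  # [3. 2.]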

Why the observation matters

  • The observation that $\det(A - \lambda I) = 0$ is only useful because solving it yields the eigenvalues $\lambda$s.
    • Those help us solve for the eigenvectors $\mathbf{v}$s (i.e. the vectors that this diagonally altered matrix $(A - \lambda I)$ “shrinks” to $\mathbf{0}$), as sketched below
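
A minimal sketch of that last step, assuming the same hypothetical 2×2 example $B$ and its eigenvalue $\lambda = 2$ (`scipy.linalg.null_space` handles the null-space computation):

import numpy as np
from scipy.linalg import null_space

# (B - λI) squishes the eigenvectors for λ to 0, so they span its null space
B = np.array([[3, 1],
              [0, 2]])
lam = 2.0
v = null_space(B - lam * np.eye(2))  # orthonormal basis of the null space of (B - λI)
print(v)                             # one column, ±[0.707, -0.707]
print(B @ v - lam * v)               # ≈ 0, confirming Bv = λv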
import numpy as np

# Alternative test matrices to try:
# A = np.random.randn(4, 4)
# A = np.array([[3, 1],
#               [0, 2]])
# A = np.diag((1, 2, 3))

A = np.array([[4, 2, 2], [2, 4, 2], [2, 2, 4]])  # symmetric, so real eigenvalues
detA = np.linalg.det(A)
 
print("A:\n", A)
print("determinant(A):", detA)
 
eig_vals, eig_vecs = np.linalg.eig(A)
# eig_vals, eig_vecs = np.linalg.eigh(A)    # for symmetric/Hermitian matrices
# eig_vals, eig_vecs = scipy.linalg.eig(A)  # SciPy equivalent (needs import scipy.linalg)
 
print("\nEigenvalues - shape:", eig_vals.shape, "values:", np.round(eig_vals, 0))
print(
    "\nEigenvectors - shape:", eig_vecs.shape, "values:\n", np.round(eig_vecs.T, 2)
)
# NumPy returns unit-norm eigenvectors as the *columns* of eig_vecs (transposed above
# just for display). Other tools scale them differently, and for a repeated eigenvalue
# any basis of its eigenspace is valid, so results can look different to Wolfram and others.
 
# Link 1: https://www.wolframalpha.com/input?i=eigenvectors+of+%7B%7B4%2C2%2C2%7D%2C%7B2%2C4%2C2%7D%2C%7B2%2C2%2C4%7D%7D
# Link 2: https://matrixcalc.org/vectors.html#eigenvectors%28%7B%7B4,2,2%7D,%7B2,4,2%7D,%7B2,2,4%7D%7D%29
 
# But let's test it against the eigenvalue problem: Av = λv
print("Check against the Eigenvalue Problem Av = λv")
print("\nAv:\n", A @ eig_vecs)
print("\nλv:\n", eig_vals * eig_vecs)  # broadcasting scales column j by eig_vals[j]
 
A:
 [[4 2 2]
 [2 4 2]
 [2 2 4]]
determinant(A): 32.0
 
Eigenvalues - shape: (3,) values: [2. 8. 2.]
 
Eigenvectors - shape: (3, 3) values:
 [[-0.82  0.41  0.41]
 [ 0.58  0.58  0.58]
 [ 0.51 -0.81  0.3 ]]
Check against the Eigenvalue Problem Av = λv
 
Av:
 [[-1.63299316  4.61880215  1.01339709]
 [ 0.81649658  4.61880215 -1.61564839]
 [ 0.81649658  4.61880215  0.6022513 ]]
 
λv:
 [[-1.63299316  4.61880215  1.01339709]
 [ 0.81649658  4.61880215 -1.61564839]
 [ 0.81649658  4.61880215  0.6022513 ]]

Eigenbasis, Diagonalisation, and Eigendecomposition

Eigenbasis

  • If our basis vectors ($\hat{i}, \hat{j}, \dots$) are themselves eigenvectors of the transformation, they form what is called an eigenbasis.
  • Then if we inspect $A$, the transformation matrix, it will have the form known as a Diagonal Matrix:

$$\Lambda = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$$

  • Its form is diagonal because, recall, $A$ only scales (stretches/shrinks) its eigenvectors, which in this case are the basis vectors
    • It is very easy to compute large powers of diagonal matrices (they simply scale vectors by the eigenvalues), as the sketch below shows
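
A quick check of that claim, using a hypothetical diagonal matrix D:

import numpy as np

# Powers of a diagonal matrix only require raising each diagonal entry to that power
D = np.diag((1.0, 2.0, 3.0))
k = 5
print(np.linalg.matrix_power(D, k))  # k-fold matrix product
print(np.diag(np.diag(D) ** k))      # identical result from one elementwise operation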

Diagonalisation: Using the Eigenbasis to Diagonalise any non-diagonal $A$

  1. Find the eigenvectors of $A$
  2. Make a change of basis matrix $S$ whose columns are the eigenvectors of $A$. We’ll use this to switch the coordinate system of $A$
  3. Diagonalise $A$ to get $\Lambda$ by doing this: $\Lambda = S^{-1} A S$
  4. The new matrix $\Lambda$ is guaranteed to be diagonal, with the eigenvalues of $A$ on its main diagonal

Derivation: Why is diagonalisation possible in the first place?

Show that $AS = S\Lambda$:

  • Suppose we have $n$ linearly independent eigenvectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ of $A$;
    • then $S = \begin{pmatrix} \mathbf{v}_1 & \cdots & \mathbf{v}_n \end{pmatrix}$ is an $n \times n$ matrix, where each column is an eigenvector of $A$,
    • so $AS = \begin{pmatrix} A\mathbf{v}_1 & \cdots & A\mathbf{v}_n \end{pmatrix} = \begin{pmatrix} \lambda_1 \mathbf{v}_1 & \cdots & \lambda_n \mathbf{v}_n \end{pmatrix}$ (final step because recall $A\mathbf{v}_i = \lambda_i \mathbf{v}_i$)
  • And so $AS = \begin{pmatrix} \lambda_1 \mathbf{v}_1 & \cdots & \lambda_n \mathbf{v}_n \end{pmatrix} = S\Lambda$, which is $\Lambda = S^{-1} A S$ after multiplying both sides by $S^{-1}$ from the left

Assumptions:

The matrix you are trying to decompose must:

  • be a square matrix
  • have linearly independent eigenvectors, one for each row of the matrix (distinct eigenvalues guarantee this, though repeated eigenvalues can still work, as in the example below)
print("Recall A:\n", A)
print("\nAnd as calculated earlier for A:")
print("\nEigenvalues:", np.round(eig_vals, 0))
print("\nEigenvectors:\n", np.round(eig_vecs.T, 2))
print("\n------\n")
 
# Define a change of basis matrix S, and then diagonalise A using Lambda = S^(-1) A S
S = eig_vecs
Lambda = np.linalg.inv(S) @ A @ S
 
print("Lambda = S^-1 @ A @ S:\n", np.round(Lambda, 2))
 
Recall A:
 [[4 2 2]
 [2 4 2]
 [2 2 4]]
 
And as calculated earlier for A:
 
Eigenvalues: [2. 8. 2.]
 
Eigenvectors:
 [[-0.82  0.41  0.41]
 [ 0.58  0.58  0.58]
 [ 0.51 -0.81  0.3 ]]
 
------
 
Lambda = S^-1 @ A @ S:
 [[ 2.  0. -0.]
 [ 0.  8. -0.]
 [-0. -0.  2.]]

Now that we know $AS = S\Lambda$, we can do:

Diagonalisation: $\Lambda = S^{-1} A S$

  • Takes matrix $A$ to produce $\Lambda$, a diagonal matrix with the eigenvalues on the main diagonal
  • Multiply both sides of $AS = S\Lambda$ by $S^{-1}$ from the left.

Eigendecomposition: $A = S \Lambda S^{-1}$

  • After diagonalising to get $\Lambda$, we can use eigendecomposition to do quick matrix multiplications of $A$ (e.g. $A^n = S \Lambda^n S^{-1}$)
  • Multiply both sides of $AS = S\Lambda$ by $S^{-1}$ from the right

# Perform eigendecomposition to recover A from its eigenvectors and eigenvalues
A_eig_decomp = S @ Lambda @ np.linalg.inv(S)
print("A_eigendecomposed = S @ Lambda @ S^-1:\n", A_eig_decomp)
 
A_eigendecomposed = S @ Lambda @ S^-1:
 [[4. 2. 2.]
 [2. 4. 2.]
 [2. 2. 4.]]
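
This is what makes eigendecomposition practical for matrix powers; a sketch reusing S and Lambda from the cell above (n = 5 is an arbitrary choice):

# Fast matrix powers: A^n = S @ Lambda^n @ S^-1, where Lambda^n is just elementwise
n = 5
A_pow = S @ np.diag(np.diag(Lambda) ** n) @ np.linalg.inv(S)
print(np.allclose(A_pow, np.linalg.matrix_power(A, n)))  # expect True (up to float error)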

Questions

  • When exactly do we use decompositions?
  • What is the intuition behind an eigenvalue and eigenvector?
  • Some interesting edge cases (what are the eigenvalues and eigenvectors for):
    • A rotation-only matrix like $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ (a 90° rotation) has only imaginary $\lambda$ ($\pm i$). No real eigenvectors, as each vector is rotated off its span
    • A shear matrix like $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$: only 1 eigenvalue ($\lambda = 1$), so only 1 eigenvector direction (all vectors on the $x$-axis are eigenvectors)
    • A scaling-only matrix like $\begin{pmatrix} c & 0 \\ 0 & c \end{pmatrix}$: only 1 eigenvalue ($\lambda = c$), but all vectors in the plane are eigenvectors, each scaled by a factor of $c$
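
These claims are easy to verify numerically; a sketch assuming the matrices above (with $c = 2$ for the scaling case):

import numpy as np

R = np.array([[0, -1], [1, 0]])  # 90° rotation
print(np.linalg.eig(R)[0])       # [0.+1.j 0.-1.j]: purely imaginary eigenvalues

Sh = np.array([[1, 1], [0, 1]])  # shear
vals, vecs = np.linalg.eig(Sh)
print(vals)                      # [1. 1.]: one repeated eigenvalue
print(vecs)                      # columns are numerically parallel: one eigenvector direction

C = np.array([[2, 0], [0, 2]])   # uniform scaling by c = 2
print(np.linalg.eig(C)[0])       # [2. 2.]; here every vector in the plane is an eigenvector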