凱力–漢米頓定理¶

Cayley–Hamilton theorem

Creative Commons License
This work by Jephian Lin is licensed under a Creative Commons Attribution 4.0 International License.

$\newcommand{\trans}{^\top} \newcommand{\adj}{^{\rm adj}} \newcommand{\cof}{^{\rm cof}} \newcommand{\inp}[2]{\left\langle#1,#2\right\rangle} \newcommand{\dunion}{\mathbin{\dot\cup}} \newcommand{\bzero}{\mathbf{0}} \newcommand{\bone}{\mathbf{1}} \newcommand{\ba}{\mathbf{a}} \newcommand{\bb}{\mathbf{b}} \newcommand{\bc}{\mathbf{c}} \newcommand{\bd}{\mathbf{d}} \newcommand{\be}{\mathbf{e}} \newcommand{\bh}{\mathbf{h}} \newcommand{\bp}{\mathbf{p}} \newcommand{\bq}{\mathbf{q}} \newcommand{\br}{\mathbf{r}} \newcommand{\bx}{\mathbf{x}} \newcommand{\by}{\mathbf{y}} \newcommand{\bz}{\mathbf{z}} \newcommand{\bu}{\mathbf{u}} \newcommand{\bv}{\mathbf{v}} \newcommand{\bw}{\mathbf{w}} \newcommand{\tr}{\operatorname{tr}} \newcommand{\nul}{\operatorname{null}} \newcommand{\rank}{\operatorname{rank}} %\newcommand{\ker}{\operatorname{ker}} \newcommand{\range}{\operatorname{range}} \newcommand{\Col}{\operatorname{Col}} \newcommand{\Row}{\operatorname{Row}} \newcommand{\spec}{\operatorname{spec}} \newcommand{\vspan}{\operatorname{span}} \newcommand{\Vol}{\operatorname{Vol}} \newcommand{\sgn}{\operatorname{sgn}} \newcommand{\idmap}{\operatorname{id}} \newcommand{\am}{\operatorname{am}} \newcommand{\gm}{\operatorname{gm}} \newcommand{\mult}{\operatorname{mult}} \newcommand{\iner}{\operatorname{iner}}$

In [ ]:
from lingeo import random_int_list
from linspace import vtop

Main idea¶

Let $A$ be an $n\times n$ matrix and

$$ p(x) = a_0x^d + a_1x^{d-1} + \cdots + a_d $$

a polynomial.
Then define

$$ p(A) = a_0A^d + a_1A^{d-1} + \cdots + a_dI. $$
Cayley–Hamilton Theorem¶

Let $A$ be an $n\times n$ matrix and

$$ p_A(x) = s_0(-x)^n + s_1(-x)^{n-1} + \cdots + s_n $$

its characteristic polyonimal.
Then

$$ p_A(A) = s_0(-A)^n + s_1(-A)^{n-1} + \cdots + s_nI = O. $$
Remark¶

Although $p_A(x) = \det(A - xI)$, it is not correct that $p_A(A) = \det(A - A\cdot I)$ by definition.
In particular, $p_A(A)$ is a matrix, while $\det(A - AI)$ is a number.

Proof of the Cayley–Hamilton Theorem¶

Let $A_x = A - xI$.
Then we know $\det(A_x) = p_A(x)$ and the adjucate $A_x\adj$ of $A_x$ has the property

$$ \begin{aligned} A_x\adj A_x &= p_A(x) I \\ &= s_0I(-x)^n + s_1I(-x)^{n-1} + \cdots + s_nI. \end{aligned} $$

On the other hand, each entry of $A_x\adj$ is a polynomial of degree at most $n-1$, so we may write

$$ A_x\adj = C_1(-x)^{n-1} + C_2(-x)^{n-2} + \cdots + C_n $$

for some constant matrices $C_1,\ldots, C_n$.
Therefore,

$$ \begin{aligned} A_x\adj A_x &= A_x\adj(A - xI) \\ &= C_1(-x)^n + (C_1A + C_2)(-x)^{n-1} + \cdots + (C_{n-1}A + C_n)(-x) + C_nA. \end{aligned} $$

By comparing the coefficients of the two expressions of $A_x\adj A_x$ we get

$$ \begin{array}{rclcl} s_0 I &= & & &C_1 \\ s_1 I &= &C_1A &+ &C_2\\ \vdots &&&& \\ s_{n-1}I &= &C_{n-1}A &+ &C_n\\ s_nI &= &C_nA. & & \end{array} $$

By post-multiplying $(-A)^n-k$ to both sides of the $k$-th equation for $k = 0,\ldots, n$, we get

$$ \begin{aligned} p_A(A) &= s_0(-A)^n + s_1(-A)^{n-1} + \cdots + s_nI \\ &= C_1(-A)^n + C_1A(-A)^{n-1} + C_2(-A)^{n-1} + C_2A(-A)^{n-2} + \cdots + C_n(-A) + C_nA\\ &= O. \end{aligned} $$

Notice that $s_0,\ldots,s_n$ can be written as polynomials in entries of $A$.
Also, each entry of $A^0,\ldots,A^n$ can be written as a polynomial in entries of $A$.
This mean each entry of $p_A(A)$ is a polynomial in entries of $A$, and it is zero polynomial!

How come these terms cancel with each other can be viewed from graph theory, but it is beyond the scope.
See the book A Combinatorial Approach to Matrix Theory and Its Applications by Richard A. Brualdi and Dragoš Cvetković for more details.

Side stories¶

  • matrix as a complex number

Experiments¶

Exercise 1¶

執行以下程式碼。

Run the code below.

In [ ]:
### code
set_random_seed(0)
print_ans = False

n = 2
A = matrix(n, random_int_list(n^2))
cs = random_int_list(n + 1)
p = vtop(cs)

print("n =", n)
pretty_print(LatexExpr("A ="), A)
pretty_print(LatexExpr("p ="), p)

if print_ans:
    for i in range(n + 1):
        pretty_print(LatexExpr("A^{%s} ="%i), A^i)
    pofA = sum([cs[i] * A^i for i in range(n + 1)], identity_matrix(n))
    pretty_print(LatexExpr("p(A) ="), pofA)
Exercise 1(a)¶

計算 $A^0, A^1, \ldots, A^n$。

Find $A^0, A^1, \ldots, A^n$.

Exercise 1(b)¶

計算 $p(A)$。

Find $p(A)$.

Exercises¶

Exercise 2¶

令

$$ p(x) = x^2 - 5x + 6 = (x - 2)(x - 3). $$

說明

$$ p(A) = A^2 - 5A + 6 = (A - 2I)(A - 3I). $$

因此 $p(A)$ 不會因為 $p$ 的表達式不同而影響結果。

Let

$$ p(x) = x^2 - 5x + 6 = (x - 2)(x - 3). $$

Verify

$$ p(A) = A^2 - 5A + 6 = (A - 2I)(A - 3I). $$

and explain why $p(A)$ does not depend on the different expressions of $p(A)$.

Exercise 3¶

對以下 $n\times n$ 矩陣 $A$,
求出其特徵多項式 $p_A(x)$ 並計算 $A^0, \ldots, A^n$,
最後驗證 $p_A(A) = O$。

For the following matrices $A$, find the characteristic polynomial $p_A(x)$ and calculate $A^0, \ldots, A^n$, and then verify $p_A(A) = O$.

Exercise 3(a)¶
$$ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}. $$
Exercise 3(b)¶
$$ A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}. $$
Exercise 4¶

令

$$ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}. $$

計算 $B_1 = A - I$、$B_2 = A - 2I$、及 $B_3 = A - 3I$。
說明 $B_1,B_2,B_3$ 彼此兩兩可交換,並驗證 $B_1B_2B_3 = O$。
求出特徵多項式 $p_A(x)$ 並說明為什麼 $p_A(A) = O$。

Let

$$ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}. $$

Calculate $B_1 = A - I$, $B_2 = A - 2I$, and $B_3 = A - 3I$. Explain why $B_1, B_2, B_3$ are mutually commutative and verify $B_1B_2B_3 = O$. Find the characteristic polynomial $p_A(x)$ and explain why $p_A(A) = O$.

Exercise 5¶

令

$$ A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}. $$

計算 $A^0, A^1, \ldots, A^5$。
求出特徵多項式 $p_A(x)$ 並說明為什麼 $p_A(A) = O$。

Let

$$ A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}. $$

Calculate $A^0, A^1, \ldots, A^5$. Then find the characteristic polynomial $p_A(x)$ and explain why $p_A(A) = O$.

Exercise 6¶

令

$$ A = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 3 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 \end{bmatrix} $$

計算 $B_1 = (A - I)^2$、$B_2 = (A - 2I)^3$、及 $B_3 = (A - 3I)^4$。
說明 $B_1,B_2,B_3$ 彼此兩兩可交換,並驗證 $B_1B_2B_3 = O$。
求出特徵多項式 $p_A(x)$ 並說明為什麼 $p_A(A) = O$。

Let

$$ A = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 3 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 \end{bmatrix} $$

Calculate $B_1 = A - I$, $B_2 = A - 2I$, and $B_3 = A - 3I$. Explain why $B_1, B_2, B_3$ are mutually commutative and verify $B_1B_2B_3 = O$. Find the characteristic polynomial $p_A(x)$ and explain why $p_A(A) = O$.

Exercise 7¶

令

$$ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} $$

而 $A_x = A - xI$。
計算 $A_x\adj$ 並將其寫成

$$ A_x\adj = C_1(-x)^{2} + C_2(-x)^{1} + C_3. $$

Let

$$ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} $$

and $A_x = A - xI$. Compute $A_x\adj$ and write it into the form

$$ A_x\adj = C_1(-x)^{2} + C_2(-x)^{1} + C_3. $$
Exercise 8¶

當 $z = a + bi$ 為一複數時,定義

$$ A(z) = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}. $$

驗證對於任兩複數 $z_1,z_2$ 都有

  • $A(z_1z_2) = A(z_1)A(z_2)$ 及
  • $A(z_1 + z_2) = A(z_1) + A(z_2)$。

所以當 $p$ 是一個多項式滿足 $p(z) = 0$ 時,則有 $p(A(z)) = O$。

For any complex number $z = a + bi$, define

$$ A(z) = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}. $$

Verify that

  • $A(z_1z_2) = A(z_1)A(z_2)$ and
  • $A(z_1 + z_2) = A(z_1) + A(z_2)$

for any two complex numbers $z_1,z_2$. Therefore, if $p$ is a polynomial with $p(z) = 0$, then we also have $p(A(z)) = O$.