Understanding the singular value decomposition
This work by Jephian Lin is licensed under a Creative Commons Attribution 4.0 International License.
$\newcommand{\trans}{^\top} \newcommand{\adj}{^{\rm adj}} \newcommand{\cof}{^{\rm cof}} \newcommand{\inp}[2]{\left\langle#1,#2\right\rangle} \newcommand{\dunion}{\mathbin{\dot\cup}} \newcommand{\bzero}{\mathbf{0}} \newcommand{\bone}{\mathbf{1}} \newcommand{\ba}{\mathbf{a}} \newcommand{\bb}{\mathbf{b}} \newcommand{\bc}{\mathbf{c}} \newcommand{\bd}{\mathbf{d}} \newcommand{\be}{\mathbf{e}} \newcommand{\bh}{\mathbf{h}} \newcommand{\bp}{\mathbf{p}} \newcommand{\bq}{\mathbf{q}} \newcommand{\br}{\mathbf{r}} \newcommand{\bx}{\mathbf{x}} \newcommand{\by}{\mathbf{y}} \newcommand{\bz}{\mathbf{z}} \newcommand{\bu}{\mathbf{u}} \newcommand{\bv}{\mathbf{v}} \newcommand{\bw}{\mathbf{w}} \newcommand{\tr}{\operatorname{tr}} \newcommand{\nul}{\operatorname{null}} \newcommand{\rank}{\operatorname{rank}} %\newcommand{\ker}{\operatorname{ker}} \newcommand{\range}{\operatorname{range}} \newcommand{\Col}{\operatorname{Col}} \newcommand{\Row}{\operatorname{Row}} \newcommand{\spec}{\operatorname{spec}} \newcommand{\vspan}{\operatorname{span}} \newcommand{\Vol}{\operatorname{Vol}} \newcommand{\sgn}{\operatorname{sgn}} \newcommand{\idmap}{\operatorname{id}} \newcommand{\am}{\operatorname{am}} \newcommand{\gm}{\operatorname{gm}} \newcommand{\mult}{\operatorname{mult}} \newcommand{\iner}{\operatorname{iner}}$
from lingeo import random_int_list
Let $A$ be an $m\times n$ matrix and
$f_A : \mathbb{R}^n\rightarrow\mathbb{R}^n$ the corresponding linear function defined by $f(\bv) = A\bv$. Let $\mathcal{E}_n$ and $\mathcal{E}_m$ be the standard bases of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively.
Then $[f_A] = [f_A]_{\mathcal{E}_n}^{\mathcal{E}_m} = A$.
Let $\alpha$ be another basis of $\mathbb{R}^n$ and $V = [\idmap]_\alpha^{\mathcal{E}_n}$.
Let $\beta$ be another basis of $\mathbb{R}^m$ and $U = [\idmap]_\beta^{\mathcal{E}_m}$.
Then $[f_A]_\alpha^\beta = U^{-1}AV$ and $A = U[f_A]_\alpha^\beta V^{-1}$.
Let $A$ be an $m\times n$ matrix.
Then there are orthonormal bases $\alpha$ and $\beta$ of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, such that
$$[f_A]_\alpha^\beta = \Sigma = \begin{bmatrix}
\operatorname{diag}(\sigma_1, \ldots, \sigma_r) & O_{r,n-r} \\
O_{m-r,r} & O_{m-r,n-r}
\end{bmatrix},$$
where $\sigma_1\geq\cdots\geq\sigma_r$ and $\operatorname{diag}(\sigma_1,\ldots,\sigma_r)$ is the diagonal matrix with the given diagonal entries.
That is, there are $m\times m$ and $n\times n$ orthogonal matrices $U$ and $V$ such that $U\trans AV = \Sigma$ and $A = U\Sigma V\trans$.
Let $\alpha = \{ \bv_1, \ldots, \bv_n \}$ and $\beta = \{ \bu_1, \ldots, \bu_m \}$ be the bases in the singular value decomposition.
Then $V$ is the matrix whose columns are vectors in $\alpha$, and
$U$ is the matrix whose columns are vectors in $\beta$.
Since $\alpha$ and $\beta$ orthogonal orthonormal, $U$ and $V$ are orthogonal and $U^{-1} = U\trans$ and $V^{-1} = V\trans$.
By examining $AV = U\Sigma$, we have
$$AV =
A\begin{bmatrix}
| & ~ & | \\
\bv_1 & \cdots & \bv_n \\
| & ~ & |
\end{bmatrix} =
U\Sigma =
\begin{bmatrix}
| & ~ & | \\
\bv_1 & \cdots & \bv_n \\
| & ~ & |
\end{bmatrix}
\begin{bmatrix}
\operatorname{diag}(\sigma_1, \ldots, \sigma_r) & O_{r,n-r} \\
O_{m-r,r} & O_{m-r,n-r}
\end{bmatrix} =
\begin{bmatrix}
| & ~ & | \\
\sigma_1\bv_1 & \cdots & \sigma_r\bv_r & O_{n,n-r} \\
| & ~ & |
\end{bmatrix}.$$
Therefore, $A\bv_i = \lambda_i \bu_i$ for $i = 1,\ldots, r$ and $A\bv_i = \bzero$ for $i = r+1,\ldots, n$.
The values $\sigma_1,\ldots,\sigma_r$ are called the sigular values of $A$.
Singular values are necessarily positive, and the number of singular values is $r = \rank(A)$.
執行以下程式碼。
令 $f : \mathbb{R}^4 \rightarrow \mathbb{R}^3$ 為一線性函數。
令 $\alpha = \{\bv_1,\ldots, \bv_4\}$ 及 $\beta = \{\bu_1,\ldots, \bu_3\}$ 分別為 $\mathbb{R}^4$ 和 $\mathbb{R}^3$ 的一組單位長垂直基底。
已知 $\Sigma = [f]_\alpha^\beta$。
Run the code below. Let $f : \mathbb{R}^4 \rightarrow \mathbb{R}^3$ be a linear function. Let $\alpha = \{\bv_1,\ldots, \bv_4\}$ and $\beta = \{\bu_1,\ldots, \bu_3\}$ be orthonormal bases of $\mathbb{R}^4$ and $\mathbb{R}^3$, respectively. Suppose $\Sigma = [f]_\alpha^\beta$.
### code
set_random_seed(0)
print_ans = False
r = choice([1,2,3])
sigs = list(map(lambda k: abs(k), random_int_list(r)))
sigs.sort(reverse=True)
m,n = 3,4
Sigma = zero_matrix(m,n)
for i in range(r):
Sigma[i,i] = sigs[i]
print("Sigma =")
show(Sigma)
if print_ans:
for i in range(n):
if i < r:
print("A v%s = %s u%s"%(i+1, sigs[i], i+1))
else:
print("A v%s = 0"%(i+1))
print("range(f_A) = span({ u1, ..., u%s })"%r)
print("ker(f_A) = span({ v%s, ..., vn })"%(r+1))
print("rank(f_A) =", r)
print("null(f_A) =", n-r)
利用 $\Sigma$ 說明 $f_A$ 的作用。
Observe the structure of $\Sigma$ and describe the effect of $f_A$.
找出 $\range(f_A)$、$\ker(f_A)$、$\rank(f_A)$、$\nul(f_A)$。
Find $\range(f_A)$, $\ker(f_A)$, $\rank(f_A)$, and $\nul(f_A)$.
令 $A$ 為一 $3\times 4$ 矩陣、
$\alpha = \{ \bv_1,\ldots,\bv_4 \}$ 為 $\mathbb{R}^4$ 的一組基底、
$\beta = \{ \bu_1,\ldots,\bu_3 \}$ 為 $\mathbb{R}^3$ 的一組基底。
已知
將 $A\bv_1$、$A\bv_2$、$A\bv_3$、及 $A(\bv_1 + \bv_2 + \bv_3)$ 分別寫成 $\beta$ 的線性組合。
Let $A$ be a $3\times 4$ matrix, $\alpha = \{ \bv_1,\ldots,\bv_4 \}$ a basis of $\mathbb{R}^4$, and $\beta = \{ \bu_1,\ldots,\bu_3 \}$ a basis of $\mathbb{R}^3$. Suppose
$$ [f_A]_\beta^\beta = \begin{bmatrix} 3 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ \end{bmatrix}. $$Write $A\bv_1$, $A\bv_2$, $A\bv_3$, and $A(\bv_1 + \bv_2 + \bv_3)$ as linear combinations of $\beta$.
令
$$ A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ \end{bmatrix}, $$$\beta = \{ \bv_1, \ldots, \bv_4 \}$ 為
$$ \begin{bmatrix} \frac{1}{2} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\ \frac{1}{2} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\ \frac{1}{2} & 0 & -\frac{2}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\ \frac{1}{2} & 0 & 0 & -\frac{3}{\sqrt{12}} \\ \end{bmatrix} $$的行向量集合、
且 $\beta = \{ \bu_1, \ldots, \bu_3 \}$ 為
的行向量集合。
Let
$$ A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ \end{bmatrix}, $$$\alpha = \{ \bv_1, \ldots, \bv_4 \}$ the columns of
$$ \begin{bmatrix} \frac{1}{2} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\ \frac{1}{2} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\ \frac{1}{2} & 0 & -\frac{2}{\sqrt{6}} & \frac{1}{\sqrt{12}} \\ \frac{1}{2} & 0 & 0 & -\frac{3}{\sqrt{12}} \\ \end{bmatrix}, $$and $\beta = \{ \bu_1, \ldots, \bu_3 \}$ the columns of
$$ \begin{bmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & -\frac{2}{\sqrt{6}} \\ \end{bmatrix}. $$寫出 $[f_A]_\alpha^\beta$ 並說明 $f_A$ 的作用。
Find $[f_A]_\alpha^\beta$ and describe the effect of $f_A$.
找出兩個垂直矩陣 $U$ 和 $V$
以及一個只有在 $i,i$-項有值的矩陣 $\Sigma$ 使得 $A = U\Sigma V\trans$。
Find orthogonal matrices $U$ and $V$, and a matrix $\Sigma$ whose nonzero entries only occurs on the $i,i$-entries, such that $A = U\Sigma V\trans$.
令 $A$ 為 $m\times n$ 矩陣且 $\rank(A) = r$。
令 $A = U\Sigma V\trans$ 為其奇異值分解﹐其中 $\sigma_1\geq \cdots \geq \sigma_r > 0$ 為其奇異值。
令 $\bu_1,\ldots,\bu_{m}$ 為 $U$ 的各行向量、
$\bv_0, \ldots, \bv_{n-1}$ 為 $V$ 的各行向量。
Let $A$ be an $m\times n$ matrix and $\rank(A) = r$. Let $A = U\Sigma V\trans$ be the singular value decomposition, where $\sigma_1\geq \cdots \geq \sigma_r > 0$ are the singular values. Let $\bu_1,\ldots,\bu_{m}$ be the columns of $U$ and $\bv_0, \ldots, \bv_{n-1}$ the columns of $V$.
說明
$$ A = \sum_{i=1}^{r} \sigma_i\bu_i\bv_i\trans. $$Verify and give intuition to
$$ A = \sum_{i=1}^{r} \sigma_i\bu_i\bv_i\trans. $$定義 $\|M\| = \sqrt{M\trans M}$ 為矩陣的長度。
說明 $\|M\|^2$ 就是 $M$ 的各項平方和。
Define $\|M\| = \sqrt{M\trans M}$ as the norm of a matirx. Verify that $\|M\|^2$ is indeed the square sum of the entries of $M$.
說明對任何 $i = 1,\ldots, r$ 都有 $\|\bu_i\bv_i\| = 1$。
因此我們可以令 $A' = \sum_{i=1}^{k} \sigma_i\bu_i\bv_i\trans$ 當作 $A$ 的逼近﹐而且有 $\|A - A'\| \leq \sum_{k+1}^r \sigma_i$。
(矩陣長度也符合三角不等式。)
Show that $\|\bu_i\bv_i\| = 1$ for each $i = 1,\ldots, r$.
Therefore, we may consider $A' = \sum_{i=1}^{k} \sigma_i\bu_i\bv_i\trans$ as an approximation of $A$ with $\|A - A'\| \leq \sum_{k+1}^r \sigma_i$. (Note that the norm of matrices also satisfies the triangle inequality.)
每張圖片的亮度都可視為一個矩陣。
令 $A$ 為原圖﹐則我們可以用 $A'$ 來降低原圖的大小。
執行以下程式碼。
試試看選用各個 $k$ 的差異。
算一下把 $A'$ 儲存起來要幾個數字。
Each picture can be viewed as a matrix, where each entry records the brightness of each pixel. Let $A$ be the original picture. Then $A'$ is the compressed data. Run the code below. Try to apply different $k$. Then calculate the number of entries required to store $A'$.
import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg as LA
arr = plt.imread('incrediville-side.jpg').mean(axis=-1)
plt.imshow(arr, cmap='Greys_r', vmin=0, vmax=255)
U,s,Vh = LA.svd(arr) ### might take long
k = 50 ### between 1 to 3120
arr_new = U[:,:k].dot(np.diag(s[:k])).dot(Vh[:k,:])
plt.imshow(arr_new, cmap='Greys_r', vmin=0, vmax=255)