Linear function as a matrix¶

This work by Jephian Lin is licensed under a Creative Commons Attribution 4.0 International License.

$\newcommand{\trans}{^\top} \newcommand{\adj}{^{\rm adj}} \newcommand{\cof}{^{\rm cof}} \newcommand{\inp}[2]{\left\langle#1,#2\right\rangle} \newcommand{\dunion}{\mathbin{\dot\cup}} \newcommand{\bzero}{\mathbf{0}} \newcommand{\bone}{\mathbf{1}} \newcommand{\ba}{\mathbf{a}} \newcommand{\bb}{\mathbf{b}} \newcommand{\bc}{\mathbf{c}} \newcommand{\bd}{\mathbf{d}} \newcommand{\be}{\mathbf{e}} \newcommand{\bh}{\mathbf{h}} \newcommand{\bp}{\mathbf{p}} \newcommand{\bq}{\mathbf{q}} \newcommand{\br}{\mathbf{r}} \newcommand{\bx}{\mathbf{x}} \newcommand{\by}{\mathbf{y}} \newcommand{\bz}{\mathbf{z}} \newcommand{\bu}{\mathbf{u}} \newcommand{\bv}{\mathbf{v}} \newcommand{\bw}{\mathbf{w}} \newcommand{\tr}{\operatorname{tr}} \newcommand{\nul}{\operatorname{null}} \newcommand{\rank}{\operatorname{rank}} %\newcommand{\ker}{\operatorname{ker}} \newcommand{\range}{\operatorname{range}} \newcommand{\Col}{\operatorname{Col}} \newcommand{\Row}{\operatorname{Row}} \newcommand{\spec}{\operatorname{spec}} \newcommand{\vspan}{\operatorname{span}} \newcommand{\Vol}{\operatorname{Vol}} \newcommand{\sgn}{\operatorname{sgn}} \newcommand{\idmap}{\operatorname{id}} \newcommand{\am}{\operatorname{am}} \newcommand{\gm}{\operatorname{gm}} \newcommand{\mult}{\operatorname{mult}} \newcommand{\iner}{\operatorname{iner}}$

In [ ]:

from lingeo import random_int_list

Main idea¶

Let $A$ be an $m\times n$ matrix, $\mathcal{E}_n = \{ \be_1, \ldots, \be_n \}$ the standard basis of $\mathbb{R}^n$, and $\bu_1, \ldots, \bu_n$ the columns of $A$.
Recall that $f_A$ is the unique linear function that satisfies the following conditions.
$$\begin{array}{rcl} f_A : \mathbb{R}^n & \rightarrow & \mathbb{R}^m \\ \be_1 & \mapsto & \bu_1 \\ ~ & \cdots & ~ \\ \be_n & \mapsto & \bu_n \\ \end{array}$$

In fact, every linear function $f$ from $\mathbb{R}^n$ to $\mathbb{R}^m$ has an $m\times n$ matrix $A$ such that $f(\bv) = A\bv$ for all $\bv\in\mathbb{R}^n$.
Let $f$ be a linear function from $\mathbb{R}^n$ to $\mathbb{R}^m$ and
$\mathcal{E}_n = \{ \be_1, \ldots, \be_n \}$ the standard basis of $\mathbb{R}^n$.
Calculate $\bu_1 = f(\be_1)$, $\ldots$, $\bu_n = f(\be_n)$ and
construct a matrix $A$ whose columns are $\bu_1, \ldots, \bu_n$.
By linearity, $f(\bv) = v_1\bu_1 + \cdots + v_n\bu_n = A\bv$ for every $\bv = (v_1, \ldots, v_n)\in\mathbb{R}^n$. We call $A$ the matrix representation of $f$, denoted by $A = [f]$.
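The construction above can be sketched in plain Python. This is only an illustration under stated assumptions: the map `f` below is a hypothetical example, and the helpers `standard_basis`, `matrix_of`, and `mat_vec` are names introduced here, not part of `lingeo` or Sage.

```python
# Hypothetical linear map f(x, y, z) = (x + 2y, 3y - z) from R^3 to R^2.
def f(v):
    x, y, z = v
    return [x + 2 * y, 3 * y - z]

def standard_basis(n):
    """Return e_1, ..., e_n as lists of length n."""
    return [[1 if i == j else 0 for i in range(n)] for j in range(n)]

def matrix_of(f, n):
    """Build [f] as a list of rows; column j is f(e_j)."""
    columns = [f(e) for e in standard_basis(n)]
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def mat_vec(A, v):
    """Compute the matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = matrix_of(f, 3)
v = [5, -1, 2]
print("A =", A)
print("f(v) =", f(v))
print("A v  =", mat_vec(A, v))
```

Evaluating `f` only on the standard basis is enough to recover every value of `f`, which is why the two printed vectors agree.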

Dimension theorem ($\mathbb{R}^n$ to $\mathbb{R}^m$)¶

Let $f$ be a linear function from $\mathbb{R}^n$ to $\mathbb{R}^m$.
Then $\rank(f) + \nul(f) = n$.

As a consequence, for a linear function $f$ from $\mathbb{R}^n$ to $\mathbb{R}^m$, the following are equivalent.

$f$ is injective.
$\nul(f) = 0$.
$\rank(f) = n$.
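As a rough numerical check of the dimension theorem, the sketch below row-reduces a matrix with exact fractions, counts the pivots to get the rank, and builds one kernel vector per free column to get the nullity. The matrix `A` and the helper names `rref` and `null_basis` are assumptions made for this illustration only.

```python
from fractions import Fraction

def rref(A):
    """Return (R, pivots): the reduced row echelon form of A and its pivot columns."""
    R = [[Fraction(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(n):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, m) if R[i][c] != 0), None)
        if p is None:
            continue
        R[r], R[p] = R[p], R[r]
        pivot = R[r][c]
        R[r] = [x / pivot for x in R[r]]
        # Clear column c in every other row.
        for i in range(m):
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return R, pivots

def null_basis(A):
    """Return a basis of the kernel of A, one vector per free column."""
    R, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row, p in zip(R, pivots):
            v[p] = -row[free]
        basis.append(v)
    return basis

# Hypothetical 3 x 4 example, so n = 4.
A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]
n = len(A[0])
rank = len(rref(A)[1])
kernel = null_basis(A)
nullity = len(kernel)
print("rank =", rank, "nullity =", nullity, "n =", n)
# Every kernel vector should be sent to the zero vector.
images = [[sum(a * x for a, x in zip(row, v)) for row in A] for v in kernel]
print("images of kernel vectors:", images)
```

Since a $3\times 4$ matrix has rank at most $3 < 4$, its nullity is positive, so the corresponding function cannot be injective.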

Side stories¶

total derivative

Experiments¶

Exercise 1¶

Run the code below. Suppose $f$ is a function from $\mathbb{R}^4$ to $\mathbb{R}^3$.

In [ ]:

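### code
set_random_seed(0)   # fix the seed so the random matrix is reproducible
print_ans = False    # set to True to reveal the answers
m,n = 3,4
A = matrix(m, random_int_list(m*n))  # m x n integer matrix built from m*n random entries
f = lambda v: A * v                  # f is multiplication by A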

if print_ans:
    print("f(0) = 0?", True)
    print("f(v1 + v2) = f(v1) + f(v2)?", True)
    print("f(k * v) = k * f(v)?", True)
    print("A =")
    show(A)

Exercise 1(a)¶

Verify whether $f(\bzero) = \bzero$. Note that the two zero vectors in the formula are the zero vectors of the domain and of the codomain, respectively.

In [ ]:

zero4 = vector([0,0,0,0])
f(zero4)

Exercise 1(b)¶

Choose any $\bv_1, \bv_2\in\mathbb{R}^4$ and verify whether $f(\bv_1 + \bv_2) = f(\bv_1) + f(\bv_2)$.

In [ ]:

v1 = vector([1,2,3,4])
v2 = vector([1,1,1,1])
print(f(v1 + v2))
print("%s + %s ="%(f(v1), f(v2)), f(v1) + f(v2))

Exercise 1(c)¶

Choose any $k\in\mathbb{R}$ and $\bv\in\mathbb{R}^4$ and verify whether $f(k\bv) = kf(\bv)$.

In [ ]:

k = 3
v = vector([1,1,1,1])
print(f(k * v))
print("%s * %s ="%(k, f(v)), k*f(v))

Exercise 1(d)¶

Find a matrix $A$ such that $f(\bv) = A\bv$ for every $\bv\in\mathbb{R}^4$.

Exercises¶

Exercise 2¶

Consider each of the following functions $f$. Find a matrix $A$ such that $f = f_A$.

Exercise 2(a)¶

$$ f(x,y,z) = (x,y,0). $$

Exercise 2(b)¶

$$ f(x,y,z) = (3x,4y,5z). $$

Exercise 2(c)¶

$$ f(x,y,z) = (x+2y+3z,4x+5y+6z,7x+8y+9z). $$

Exercise 2(d)¶

$$ f(x,y,z) = (y,z,x). $$

Exercise 2(e)¶

The function $f: \mathbb{R}^3 \rightarrow \mathbb{R}^3$ that projects every vector onto the line spanned by $(1,1,1)$.

Exercise 2(f)¶

The function $f: \mathbb{R}^3 \rightarrow \mathbb{R}^3$ that rotates every vector by $45^\circ$ about the $z$-axis. Here the rotation is counterclockwise when viewed from the north pole looking toward the origin.

Exercise 3¶

Let $f$ be a differentiable function from $\mathbb{R}^n$ to $\mathbb{R}^m$, which is not necessarily linear. Then $f$ can be written as

$$ f(x_1,\ldots, x_n) = (f_1(x_1,\ldots, x_n), \ldots, f_m(x_1,\ldots, x_n)). $$

Then the total derivative of $f$ is

$$ \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \vdots & \cdots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n} \\ \end{bmatrix}. $$
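To make the definition concrete, here is a minimal sketch that approximates the matrix of partial derivatives by central differences. The function `f` from $\mathbb{R}^2$ to $\mathbb{R}^3$, the helper `jacobian`, and the step size `h` are all assumptions chosen for illustration; an exact symbolic computation would be preferable in practice.

```python
import math

def f(x):
    """Hypothetical differentiable map from R^2 to R^3."""
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x1 + x2 ** 2]

def jacobian(f, x, h=1e-6):
    """Approximate the m x n matrix of partial derivatives at x by central differences."""
    n = len(x)
    m = len(f(x))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Perturb only the j-th input coordinate.
        xp = list(x)
        xp[j] += h
        xm = list(x)
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            # Entry (i, j) approximates the partial derivative of f_i in x_j.
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

point = [1.0, 2.0]
J = jacobian(f, point)
# Partial derivatives computed by hand for comparison.
exact = [[2.0, 1.0],
         [math.cos(1.0), 0.0],
         [1.0, 4.0]]
for row in J:
    print(row)
```

Each row of `J` belongs to one output component $f_i$ and each column to one input variable $x_j$.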

Exercise 3(a)¶

The goal of taking the derivative is to approximate a function locally by a linear function. Give some intuition for why the total derivative of a function from $\mathbb{R}^n$ to $\mathbb{R}^m$ has to be an $m\times n$ matrix rather than an $n\times m$ matrix.

Exercise 3(b)¶

Let $A$ be an $m\times n$ matrix and $\bb\in\mathbb{R}^m$. Define $f(\bx) = A\bx + \bb$. Find the total derivative of $f$.