CPSC 406 – Computational Optimization
\[ \def\argmin{\operatorname*{argmin}} \def\Ball{\mathbf{B}} \def\bmat#1{\begin{bmatrix}#1\end{bmatrix}} \def\Diag{\mathbf{Diag}} \def\half{\tfrac12} \def\ip#1{\langle #1 \rangle} \def\maxim{\mathop{\hbox{\rm maximize}}} \def\maximize#1{\displaystyle\maxim_{#1}} \def\minim{\mathop{\hbox{\rm minimize}}} \def\minimize#1{\displaystyle\minim_{#1}} \def\norm#1{\|#1\|} \def\Null{{\mathbf{null}}} \def\proj{\mathbf{proj}} \def\R{\mathbb R} \def\Rn{\R^n} \def\rank{\mathbf{rank}} \def\range{{\mathbf{range}}} \def\span{{\mathbf{span}}} \def\st{\hbox{\rm subject to}} \def\T{^\intercal} \def\textt#1{\quad\text{#1}\quad} \def\trace{\mathbf{trace}} \]
The singular value decomposition (SVD) reveals many of the most important properties of a matrix. It is a generalization of the eigenvalue decomposition (EVD) to non-square matrices.
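The 12×12 multiplication table below is a natural first example of a low-rank matrix. A minimal Julia sketch of how such a matrix might be built (the variable name `A` and the outer-product construction are assumptions, not shown in the original cell):

```julia
using LinearAlgebra

A = (1:12) * (1:12)'   # outer product: entry (i,j) is i*j
```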
```
12×12 Matrix{Int64}:
  1   2   3   4   5   6   7   8    9   10   11   12
  2   4   6   8  10  12  14  16   18   20   22   24
  3   6   9  12  15  18  21  24   27   30   33   36
  4   8  12  16  20  24  28  32   36   40   44   48
  5  10  15  20  25  30  35  40   45   50   55   60
  6  12  18  24  30  36  42  48   54   60   66   72
  7  14  21  28  35  42  49  56   63   70   77   84
  8  16  24  32  40  48  56  64   72   80   88   96
  9  18  27  36  45  54  63  72   81   90   99  108
 10  20  30  40  50  60  70  80   90  100  110  120
 11  22  33  44  55  66  77  88   99  110  121  132
 12  24  36  48  60  72  84  96  108  120  132  144
```
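Every column is a multiple of `1:12`, so despite its size this matrix has rank 1. Continuing the snippet above, a quick check:

```julia
rank(A)       # 1
svdvals(A)    # one nonzero singular value; the rest vanish (numerically)
```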
For any \(m\times n\) matrix \(A\) with rank \(r\), \[ \begin{aligned} A = U\Sigma V\T = [u_1\ |\ u_2\ |\cdots\ |\ u_r] \begin{bmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_r \\ \end{bmatrix} \begin{bmatrix} v_1\T \\ \hline v_2\T \\ \hline \vdots \\ \hline v_r\T \end{bmatrix} = \sum_{j=1}^r \sigma_j u_j v_j\T \end{aligned} \] The left singular vectors (columns of \(U\)) and right singular vectors (columns of \(V\)) are orthonormal, and the singular values are positive and ordered: \[ U\T U = I_r, \qquad V\T V = I_r, \qquad \sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0 \]
for \(j=1,\ldots,r\), the left and right singular vectors come in pairs: \[ \begin{aligned} AV &= U\Sigma, & Av_j &= \sigma_j u_j,\\ A\T U &= V\Sigma, & A\T u_j &= \sigma_j v_j. \end{aligned} \]
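These pairing relations are easy to verify numerically; a small sketch with a random matrix:

```julia
using LinearAlgebra

A = randn(5, 3)
F = svd(A)                            # thin SVD: F.U, F.S, F.V
j = 2
A  * F.V[:, j] ≈ F.S[j] * F.U[:, j]   # true:  A vⱼ = σⱼ uⱼ
A' * F.U[:, j] ≈ F.S[j] * F.V[:, j]   # true:  Aᵀ uⱼ = σⱼ vⱼ
```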
What is the rank of the Finnish flag?
What is the rank of the Greek flag?
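A flag has low rank when it is built from few distinct row patterns. A hedged sketch with a crude 0/1 stand-in for the Finnish flag (the encoding is a simplified assumption; the Greek flag can be encoded and checked the same way):

```julia
using LinearAlgebra

# crude 0/1 stand-in for the Finnish flag: a cross on a plain field
flag = zeros(Int, 6, 9)
flag[3:4, :] .= 1    # horizontal bar of the cross
flag[:, 3:4] .= 1    # vertical bar of the cross
rank(flag)           # 2 -- only two distinct row patterns appear
```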
the SVD reveals bases for the four fundamental subspaces of \(A\), and with them the orthogonal projectors onto those subspaces:
\[ \begin{aligned} \proj_{\range(A\T)} &= VV\T\\ \proj_{\Null(A)} &= I_n-VV\T \end{aligned} \]
\[ \begin{aligned} \proj_{\range(A)} &= UU\T\\ \proj_{\Null(A\T)} &= I_m-UU\T \end{aligned} \]
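A numerical check of these projector identities, using a random low-rank test matrix:

```julia
using LinearAlgebra

A = randn(6, 4) * randn(4, 5)   # 6×5 with rank 4
r = rank(A)
V = svd(A).V[:, 1:r]
P = V * V'                      # projector onto range(Aᵀ)
P * A' ≈ A'                     # true: P fixes the rows of A
norm((I - P) * A')              # ≈ 0: I − P annihilates range(Aᵀ)
```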
Conceptually, we can construct the SVD from the Gram matrix \(A\T A\):
gather the positive eigenvalues of \(A\T A\) in descending order (repeated according to multiplicity): \[ \lambda_1\geq \lambda_2\geq \cdots \geq \lambda_r > 0, \qquad \Lambda := \mathbf{Diag}(\lambda_1,\ldots,\lambda_r) \]
because \(A\T A\) is symmetric, the spectral theorem ensures it's orthogonally diagonalizable; keeping the eigenvectors for \(\lambda_1,\ldots,\lambda_r\) as the columns of \(V\), \[ A\T A = V\Lambda V\T \]
define singular values: \[ \sigma_i:=\sqrt{\lambda_i}, \qquad \Sigma:= \mathbf{Diag}(\sigma_1,\ldots,\sigma_r) \]
define left singular vectors: \[U:= AV\Sigma^{-1}\]
summary: \[A = U\Sigma V\T\]
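A sketch of this construction for a full-column-rank matrix, checked against the library SVD:

```julia
using LinearAlgebra

A = randn(8, 4)                  # full column rank (almost surely)
λ, V = eigen(Symmetric(A' * A))  # eigenvalues come back in ascending order
ord = sortperm(λ; rev=true)      # reorder to descending
λ, V = λ[ord], V[:, ord]
σ = sqrt.(λ)                     # singular values
U = A * V * Diagonal(1 ./ σ)     # left singular vectors
A ≈ U * Diagonal(σ) * V'         # true
σ ≈ svdvals(A)                   # matches the library routine
```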
Given SVD \(A = U\Sigma V\T\) with singular values \(\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0\)
the columns of \(U\) form an orthonormal basis for \(\range(A)\)
the columns of \(V\) form an orthonormal basis for \(\range(A\T)\)
spectral norm of \(A\): \[ \|A\|_2 = \max_{\|x\|_2=1} \|Ax\|_2 = \sigma_1 \]
Frobenius norm of \(A\): \[ \|A\|_F := \sqrt{\sum_{i=1}^m\sum_{j=1}^n a^2_{ij}} \equiv \sqrt{\trace(A\T A)}= \sqrt{\sum_{i=1}^r \sigma_i^2} \]
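Both norm identities are easy to confirm numerically:

```julia
using LinearAlgebra

A = randn(5, 7)
σ = svdvals(A)
opnorm(A) ≈ σ[1]               # spectral norm = σ₁
norm(A)   ≈ sqrt(sum(σ .^ 2))  # norm(A) on a matrix is the Frobenius norm
```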
Suppose that \(u\in\R^m\) and \(v\in\R^n\) are unit-norm vectors. Then the outer product \(uv\T\) is an \(m\times n\) matrix with rank 1. What is the spectral norm of \(uv\T\)?
Suppose that \(x\in\R^m\) and \(y\in\R^n\) are nonzero vectors. Then the outer product \(xy\T\) is an \(m\times n\) matrix with rank 1. What is the spectral norm of \(xy\T\)?
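A numerical check of both answers, since \(\sigma_1(xy\T) = \|x\|_2\|y\|_2\) reduces to 1 for unit vectors:

```julia
using LinearAlgebra

x, y = randn(6), randn(4)
opnorm(x * y') ≈ norm(x) * norm(y)   # true: σ₁(xyᵀ) = ‖x‖₂‖y‖₂
svdvals(x * y')[2] < 1e-12           # all other singular values vanish
```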
The SVD decomposes any matrix \(A\) with rank \(r\) into a sum of rank-1 matrices: \[ A = U\Sigma V\T = [u_1\ |\ u_2\ |\cdots\ |\ u_r] \begin{bmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_r \end{bmatrix} \begin{bmatrix} v_1\T \\ \hline v_2\T \\ \hline \vdots \\ \hline v_r\T \end{bmatrix} = \sum_{j=1}^r \sigma_j u_j v_j\T \]
The best rank-\(k\) approximation to \(A\), in both the spectral and Frobenius norms, is the truncated sum \[ A_k = \sum_{j=1}^k \sigma_j u_j v_j\T \] with error \[ \|A-A_k\|_2 = \sigma_{k+1} \textt{and} \|A-A_k\|_F = \sqrt{\sum_{j=k+1}^r \sigma_j^2} \]
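A sketch of the truncated SVD and its error, matching the formulas above:

```julia
using LinearAlgebra

A = randn(30, 20)
F = svd(A)
k = 5
Ak = F.U[:, 1:k] * Diagonal(F.S[1:k]) * F.Vt[1:k, :]   # best rank-k approximation
opnorm(A - Ak) ≈ F.S[k+1]                              # spectral error = σₖ₊₁
norm(A - Ak) ≈ sqrt(sum(F.S[k+1:end] .^ 2))            # Frobenius error
```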
the full SVD (shown here for \(m\ge n\)) extends the reduced factors with orthonormal columns so that \(U\in\R^{m\times m}\) and \(V\in\R^{n\times n}\) are orthogonal: \[ \begin{aligned} U &= [\,\hat U \mid \bar U\, ]\\ \hat U &= [\, u_1,\ldots,u_r, u_{r+1}, \ldots, u_n\, ]\\ \bar U &= [\, u_{n+1}, \ldots, u_m\, ]\\ V &= [\, v_1,\ldots,v_n\, ]\\ \Sigma &= \Diag(\sigma_1,\ldots,\sigma_r,0,\ldots,0) \end{aligned} \]
Provides orthonormal bases for all four fundamental subspaces:
\[ \begin{aligned} \range(A)&=\span\{u_1,\ldots,u_r\}\\ \Null(A\T)&=\span\{u_{r+1},\ldots,u_m\}\\ \range(A\T)&=\span\{v_1,\ldots,v_r\}\\ \Null(A)&=\span\{v_{r+1},\ldots,v_n\} \end{aligned} \]
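A numerical check that the trailing singular vectors span the two null spaces:

```julia
using LinearAlgebra

A = randn(6, 3) * randn(3, 5)   # 6×5 with rank 3
r = rank(A)
F = svd(A; full=true)           # full SVD: F.U is 6×6, F.V is 5×5
norm(A' * F.U[:, r+1:end])      # ≈ 0: u_{r+1},…,u_m span null(Aᵀ)
norm(A  * F.V[:, r+1:end])      # ≈ 0: v_{r+1},…,v_n span null(A)
```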
If \(A\) is \(m\times n\) with \(\rank(A)=r<n\), then there are infinitely many least-squares solutions: \[ \mathcal X = \{x\in\Rn \mid A\T A x = A\T b\} \]
The SVD provides the minimum-norm solution \(\bar x = \argmin\{\, \|x\| \mid x\in\mathcal X \,\}\)
If \(A=U\Sigma V\T\) is the full SVD of \(A\), then \[ \begin{aligned} \|Ax-b\|^2 &= \|(U\T A V)(V\T x) - U\T b\|^2 \\ &= \|\Sigma y - U\T b\|^2 & (y:= V\T x)\\ &= \sum_{j=1}^r (\sigma_j y_j - \bar b_j)^2 + \sum_{j=r+1}^m \bar b_j^2 & (\bar b_j=u_j\T b)\\ \end{aligned} \] Choose \[ y_j = \begin{cases} \bar b_j/\sigma_j & j=1,\ldots,r\\ 0 & j=r+1,\ldots,n \end{cases} \quad\Longrightarrow\quad \bar x = V y = \sum_{j=1}^r \frac{u_j\T b}{\sigma_j} v_j \] Setting \(y_j=0\) for \(j>r\) makes \(\|\bar x\|=\|y\|\) as small as possible among all least-squares solutions, which is why \(\bar x\) is the minimum-norm solution.
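A sketch using Julia's SVD-based pseudoinverse `pinv`, which computes exactly this minimum-norm solution:

```julia
using LinearAlgebra

A = randn(8, 3) * randn(3, 5)   # 8×5 with rank 3 < 5: rank deficient
b = randn(8)
xbar = pinv(A) * b              # SVD-based minimum-norm least-squares solution
norm(A' * (A * xbar - b))       # ≈ 0: xbar satisfies the normal equations
# any other least-squares solution adds a null-space component and is longer
z = xbar + nullspace(A)[:, 1]
norm(A' * (A * z - b)) < 1e-8 && norm(xbar) < norm(z)   # true
```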