CPSC 406 – Computational Optimization
\[ \def\argmin{\operatorname*{argmin}} \def\Ball{\mathbf{B}} \def\bmat#1{\begin{bmatrix}#1\end{bmatrix}} \def\Diag{\mathbf{Diag}} \def\half{\tfrac12} \def\ip#1{\langle #1 \rangle} \def\maxim{\mathop{\hbox{\rm maximize}}} \def\maximize#1{\displaystyle\maxim_{#1}} \def\minim{\mathop{\hbox{\rm minimize}}} \def\minimize#1{\displaystyle\minim_{#1}} \def\norm#1{\|#1\|} \def\Null{{\mathbf{null}}} \def\proj{\mathbf{proj}} \def\R{\mathbb R} \def\Rn{\R^n} \def\rank{\mathbf{rank}} \def\range{{\mathbf{range}}} \def\span{{\mathbf{span}}} \def\st{\hbox{\rm subject to}} \def\T{^\intercal} \def\textt#1{\quad\text{#1}\quad} \def\trace{\mathbf{trace}} \]
Fit a line \(\bar y = c + s z\) to data \((z_1,y_1),\ldots,(z_m,y_m)\):
\[ \min_{c,s}\ \sum_{i=1}^m (y_i - \bar y_i)^2 \quad\st\quad \bar y_i = c + s z_i,\quad i=1,\ldots,m \]
Matrix formulation:
\[ \min_x \|Ax - b\|^2_2 = \sum_{i=1}^m (a_i^T x - b_i)^2 \] where \[ A = \begin{bmatrix} 1 & z_1 \\ \vdots & \vdots \\ 1 & z_m \end{bmatrix},\quad b = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix},\quad x = \begin{bmatrix} c \\ s \end{bmatrix} \]
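A minimal NumPy sketch of this formulation, with made-up data points \((z_i, y_i)\) purely for illustration: it builds \(A\) and \(b\) as above and solves the least-squares problem with `np.linalg.lstsq`.

```python
import numpy as np

# Illustrative data (z_i, y_i); the values are invented for this example.
z = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2, 3.9])

# Design matrix A = [1  z] and right-hand side b = y.
A = np.column_stack([np.ones_like(z), z])
b = y

# Solve min_x ||Ax - b||_2^2; x = (c, s) gives intercept and slope.
x, *_ = np.linalg.lstsq(A, b, rcond=None)
c, s = x
print(f"intercept c = {c:.3f}, slope s = {s:.3f}")
```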
Given \(m\) measurements \(y_i\) taken at times \(t_i\): \[ (t_1,y_1),\ldots,(t_m,y_m) \]
Polynomial model \(p(t)\) of degree \((n-1)\):
\[p(t) = x_0+x_1t+x_2t^2+\cdots+x_{n-1}t^{n-1} \quad (x_i=\text{coeff's})\]
Find coefficients \(x_0,x_1,\ldots,x_{n-1}\) such that
\[\begin{align} p(t_1) &\approx y_1 \\\vdots \\p(t_m) &\approx y_m \end{align}\]
\(\Longleftrightarrow\)
\[ \underbrace{ \begin{bmatrix} 1 & t_1 & t_1^2 & \cdots & t_1^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & t_m & t_m^2 & \cdots & t_m^{n-1} \end{bmatrix}}_{A} \underbrace{ \begin{bmatrix} x_0 \\ \vdots \\ x_{n-1} \end{bmatrix} }_{x} \approx \underbrace{ \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix} }_{b} \]
Find \(x\) where \(Ax\approx b\)
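The same recipe applies to the polynomial fit: build the Vandermonde matrix \(A\) and solve the least-squares problem. The sketch below is one way to do it in NumPy (the function name `polyfit_ls` and the sample data are assumptions made for illustration).

```python
import numpy as np

def polyfit_ls(t, y, n):
    """Fit p(t) = x_0 + x_1 t + ... + x_{n-1} t^{n-1} to data (t_i, y_i) by least squares."""
    # Vandermonde matrix with columns 1, t, t^2, ..., t^(n-1).
    A = np.vander(t, N=n, increasing=True)
    # Coefficients x minimizing ||Ax - y||_2.
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    return x

# Illustrative use: noisy samples of a quadratic.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)
y = 1 + 2*t - 3*t**2 + 0.05*rng.standard_normal(t.size)
print(polyfit_ls(t, y, n=3))   # roughly [1, 2, -3]
```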
Suppose that \(A\) is an \(m\times n\) full-rank matrix with \(m>n\). Then the system \(Ax=b\) is overdetermined and in general has no exact solution, so instead of solving \(Ax=b\) we seek the \(x\) that minimizes the residual:
\[x^*=\argmin_x f(x):=\half\|Ax-b\|_2^2=\half\sum_{i=1}^m(a_i^T x - b_i)^2\]
quadratic objective \[ \|r\|_2^2=r^Tr \quad\Longrightarrow\quad f(x) = \half(Ax-b)^T(Ax-b) = \half x^T A^T Ax - b^T Ax + \half b^T b \]
gradient: \(\nabla f(x) = A^TAx-A^Tb\)
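A quick sanity check of this gradient formula against central finite differences; the data here is randomly generated just for the check.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 3))
b = rng.standard_normal(10)
x = rng.standard_normal(3)

f    = lambda v: 0.5 * np.linalg.norm(A @ v - b)**2   # f(v) = 1/2 ||Av - b||^2
grad = A.T @ A @ x - A.T @ b                          # closed-form gradient at x

# Central finite differences, one coordinate at a time.
eps = 1e-6
fd = np.array([(f(x + eps*e) - f(x - eps*e)) / (2*eps) for e in np.eye(3)])
print(np.allclose(grad, fd, atol=1e-5))               # True
```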
the solution of LS must be a stationary point of \(f\): \[ \nabla f(x^*) = 0 \quad\Longleftrightarrow\quad A^TAx^*-A^Tb=0 \quad\Longleftrightarrow\quad \underbrace{A^TAx^*=A^Tb}_{\text{normal equations}} \]
If \(A\) has full column rank \(\quad\Longrightarrow\quad\) \(x^*=(A^TA)^{-1}A^Tb\quad(\text{unique})\)
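In code, the normal equations can be solved directly, although library least-squares solvers (QR/SVD based) are usually preferred because forming \(A^TA\) squares the condition number of \(A\). A sketch with randomly generated data:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 4))   # random tall matrix; full column rank almost surely
b = rng.standard_normal(50)

# Normal equations: solve (A^T A) x = A^T b.
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# SVD-based least-squares solver; avoids forming A^T A explicitly.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_ne, x_ls))     # True for a well-conditioned A
```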
\[ A = [a_1\ a_2\ \cdots\ a_n] \quad\text{where}\quad a_i\in\mathbb{R}^m \]
\[\begin{align} \range(A)&=\{y\mid y=Ax \text{ for some } x\in\mathbb{R}^n\} \\ \Null(A^T)&=\{z\mid A^Tz=0\} \\ \range(A)^\perp&=\Null(A^T) \end{align}\]
orthogonality of residual \(r=b-Ax\) and columns of \(A\) \[ \left. \begin{align} a_1^T r &= 0 \\ a_2^T r &= 0 \\ \vdots \\ a_n^T r &= 0 \end{align} \right\} \quad\Longleftrightarrow\quad \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_n^T \end{bmatrix}r = 0 \quad\Longleftrightarrow\quad A^T r = 0 \quad\Longleftrightarrow\quad r\in\Null(A^T) \]
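This orthogonality is easy to verify numerically: at a least-squares solution, \(A^Tr\) vanishes up to rounding error. A small sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((30, 3))
b = rng.standard_normal(30)

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
r = b - A @ x_star                  # residual at the least-squares solution

# A^T r = 0: the residual is orthogonal to every column of A,
# i.e. r lies in null(A^T) = range(A)^perp.
print(np.linalg.norm(A.T @ r))      # close to machine precision, e.g. ~1e-14
```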
the following conditions on \(x^*\) are equivalent: \(x^*\) minimizes \(f(x)=\half\norm{Ax-b}_2^2\); \(\nabla f(x^*)=0\), i.e. the normal equations \(A^TAx^*=A^Tb\) hold; the residual \(r=b-Ax^*\) satisfies \(A^Tr=0\), i.e. \(r\in\Null(A^T)\)
the projection \(y^*=Ax^* = \proj_{\range(A)}(b)\) is unique, even when \(x^*\) itself is not
\[ A = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \quad\text{and}\quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} \quad (n = 1) \]
If \(m=3\) and \(b=(1, 3, 5)\), what is \(x^*\)?
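As a check on this example, the normal equations reduce to \(m\,x^* = \sum_i b_i\), so \(x^*\) is the mean of the entries of \(b\):

```python
import numpy as np

b = np.array([1.0, 3.0, 5.0])
A = np.ones((3, 1))                 # single column of ones (n = 1)

# Normal equations: A^T A = m and A^T b = sum(b), so x* = mean(b).
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_star, b.mean())             # [3.] 3.0
```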