QR Factorization
CPSC 406 – Computational Optimization
Overview
\[
\def\argmin{\operatorname*{argmin}}
\def\Ball{\mathbf{B}}
\def\bmat#1{\begin{bmatrix}#1\end{bmatrix}}
\def\Diag{\mathbf{Diag}}
\def\half{\tfrac12}
\def\ip#1{\langle #1 \rangle}
\def\maxim{\mathop{\hbox{\rm maximize}}}
\def\maximize#1{\displaystyle\maxim_{#1}}
\def\minim{\mathop{\hbox{\rm minimize}}}
\def\minimize#1{\displaystyle\minim_{#1}}
\def\norm#1{\|#1\|}
\def\Null{{\mathbf{null}}}
\def\proj{\mathbf{proj}}
\def\R{\mathbb R}
\def\Rn{\R^n}
\def\rank{\mathbf{rank}}
\def\range{{\mathbf{range}}}
\def\span{{\mathbf{span}}}
\def\st{\hbox{\rm subject to}}
\def\T{^\intercal}
\def\textt#1{\quad\text{#1}\quad}
\def\trace{\mathbf{trace}}
\]
- Orthogonality
- QR properties
- Solution of linear least-squares problems
Orthogonal vectors
For two vectors \(x\) and \(y\) in \(\Rn\), recall the cosine identity \[
x\T y = \|x\|_2\|y\|_2\cos\theta
\]
\(x\) and \(y\) are orthogonal if \[
x\T y = 0 \quad (\cos\theta = 0)
\]
\(x\) and \(y\) are orthonormal if \[
x\T y = 0, \quad x\T x = 1, \quad y\T y = 1
\]
a set of nonzero orthogonal \(n\)-vectors \(\{q_1,\ldots,q_m\}\) is linearly independent
- if \(m=n\) then it’s a basis for \(\Rn\)
Orthogonal matrices
An \(n\times r\) matrix \(Q\) is orthonormal if its columns are pairwise orthogonal and each has unit norm: \[
Q = [q_1\ | \cdots |\ q_r],
\qquad
Q\T Q = \begin{bmatrix}
q_1\T q_1 & q_2\T q_1 & \cdots & q_r\T q_1 \\
q_1\T q_2 & q_2\T q_2 & \cdots & q_r\T q_2 \\
\vdots & \vdots & \ddots & \vdots \\
q_1\T q_r & q_2\T q_r & \cdots & q_r\T q_r
\end{bmatrix}
= I_r
\]
if \(r=n\) (i.e., \(Q\) is square) then \(Q\) is orthogonal \[
Q^{-1} = Q\T \quad\text{and}\quad Q\T Q = QQ\T = I_n
\]
orthogonal transformations preserve lengths and angles \[
\|x\|_2 = \|Qx\|_2
\textt{and}
x\T y = x\T Q\T Qy = (Qx)\T(Qy)
\textt{and}
\det(Q) = \pm 1
\]
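These properties are easy to verify numerically; a minimal sketch, using the \(Q\) factor of a random matrix as the orthogonal matrix:

```julia
using LinearAlgebra

n = 5
Q = Matrix(qr(randn(n, n)).Q)   # a random n×n orthogonal matrix

x, y = randn(n), randn(n)
@show norm(Q*x) ≈ norm(x)       # lengths preserved
@show (Q*x)'*(Q*y) ≈ x'y        # inner products (angles) preserved
@show abs(det(Q)) ≈ 1.0         # det(Q) = ±1
```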
QR Factorization
For \(A\) \(m\times n\) with \(m\geq n\) and full rank, the (full) QR factorization is \[
A = QR = \bmat{\hat Q & \bar Q}\bmat{\hat R \\ 0} = \hat Q\hat R
\]
where
- \(Q\) is orthogonal (\(Q\T Q = Q Q\T = I_m\))
- \(\hat R\) is upper triangular (\(\hat R_{ij} = 0\) for \(i > j\))
- \(\range(\hat{Q})=\range(A)\)
- \(\range(\bar{Q})=\range(A)^\perp \equiv \Null(A\T)\)
```julia
using LinearAlgebra
Q, R = qr(A)   # compact factorization object; Q is stored implicitly
```
Reduced QR Factorization
For \(A\) \(m\times n\) with \(m\geq n\), full rank, the reduced factorization \(A=\hat Q\hat R\) can be built column by column: with \(\hat Q=[q_1\ |\ \cdots\ |\ q_n]\), \[
\begin{align*}
a_1 &= r_{11}q_1 \\
a_2 &= r_{12}q_1 + r_{22}q_2 \\
a_3 &= r_{13}q_1 + r_{23}q_2 + r_{33}q_3 \\
&\vdots \\
a_n &= r_{1n}q_1 + r_{2n}q_2 + \cdots + r_{nn}q_n
\end{align*}
\]
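Reading this triangular system column by column yields the classical Gram–Schmidt procedure. A minimal sketch (for illustration only; Julia's `qr` uses the more stable Householder approach):

```julia
using LinearAlgebra

# classical Gram–Schmidt: A = Q̂R̂ with Q̂ orthonormal, R̂ upper triangular
function gram_schmidt(A)
    m, n = size(A)
    Q = zeros(m, n)
    R = zeros(n, n)
    for j in 1:n
        v = A[:, j]
        for i in 1:j-1
            R[i, j] = Q[:, i]' * A[:, j]   # r_ij = qᵢᵀaⱼ
            v -= R[i, j] * Q[:, i]         # remove the qᵢ component
        end
        R[j, j] = norm(v)
        Q[:, j] = v / R[j, j]              # normalize the remainder
    end
    Q, R
end

A = randn(8, 3)
Q, R = gram_schmidt(A)
@show norm(A - Q*R)    # ≈ 0
@show norm(Q'Q - I)    # ≈ 0
```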
Nonsingular equations with QR
Given an \(n\times n\) matrix \(A\), full rank, solve \[
A x = b
\] via the QR factorization \(A = QR\), using \(Q\T Q = I\)
mathematically
\[
x = A^{-1}b = (QR)^{-1}b = R^{-1}Q^{-1}b = R^{-1}Q\T b
\]
computationally
```julia
using LinearAlgebra
Q, R = qr(A)   # O(n^3) <-- dominant cost
y = Q'b        # O(n^2) <-- matrix-vector multiply
x = R \ y      # O(n^2) <-- triangular solve
```
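As a quick check with hypothetical random data, the QR-based solve matches Julia's backslash:

```julia
using LinearAlgebra
A = randn(4, 4) + 4I   # diagonally shifted, (almost surely) nonsingular
b = randn(4)
Q, R = qr(A)
x = R \ (Q'b)
@show x ≈ A \ b        # true
```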
Solving Least-Squares via QR
\[
\min_{x\in\Rn}\ \|Ax-b\|^2, \qquad A = QR
\]
\[
\begin{align}
\|Ax-b\|^2 &= (Ax-b)\T(Ax-b) \\
&= (Ax-b)\T Q Q\T(Ax-b) \\
&= \|Q\T(Ax-b)\|^2 \\
&= \left\|\begin{bmatrix} \hat R \\ 0 \end{bmatrix}x -
\begin{bmatrix} \hat Q\T \\ \bar Q\T \end{bmatrix}b\right\|^2 \\
&= \underbrace{\color{DarkRed}{\|\hat R x - \hat Q\T b\|^2}}_{\small (1)}
+ \underbrace{\color{Green}{\|\bar Q\T b\|^2}}_{\small (2)}
\end{align}
\]
where (1) is minimized when \(\hat Rx=\hat Q\T b\) and (2) is constant
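In code (a minimal sketch with random data): the minimizer solves the triangular system in (1), and (2) is exactly the optimal residual norm:

```julia
using LinearAlgebra

A, b = randn(10, 3), randn(10)
F = qr(A)
Q = F.Q * Matrix(I, 10, 10)       # materialize the full Q = [Q̂ | Q̄]
Q̂, Q̄ = Q[:, 1:3], Q[:, 4:end]
x = F.R \ (Q̂'b)                   # (1): solve R̂x = Q̂ᵀb
@show norm(A*x - b) ≈ norm(Q̄'b)   # (2): the constant residual norm
```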
Solving Least-Squares via QR
\[
\min_{x\in\Rn}\ \|Ax-b\|^2, \qquad A = QR
\]
mathematically
\[
\begin{align}
A\T A x &= A\T b \\
R\T Q\T Q R x &= R\T Q\T b \\
Rx &= Q\T b && (Q\T Q = I,\ R\T\ \text{nonsingular}) \\
x &= R^{-1}Q\T b
\end{align}
\]
computationally
```julia
using LinearAlgebra
F = qr(A)                 # O(mn^2) <-- dominant cost
Q, R = Matrix(F.Q), F.R   # extract _thin_ Q and R
y = Q'b                   # O(mn)  <-- matrix-vector multiply
x = R \ y                 # O(n^2) <-- triangular solve
```
more numerically stable than solving \(A\T Ax = A\T b\) directly
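The instability comes from squaring the condition number: \(\mathrm{cond}(A\T A) = \mathrm{cond}(A)^2\), so the normal equations lose roughly twice as many digits. A quick check with a hypothetical ill-conditioned matrix:

```julia
using LinearAlgebra

A = randn(100, 3) * Diagonal([1.0, 1e-3, 1e-6])  # badly scaled columns
@show cond(A)     # ≈ 1e6
@show cond(A'A)   # ≈ 1e12 ≈ cond(A)^2
```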
Accuracy of QR vs Normal Equations
For \(\epsilon>0\) this matrix has full rank, but it is nearly rank deficient: since \(\sin^2(\theta)+\cos^2(\theta)=1\), the first two columns sum to the third when \(\epsilon=0\) \[ A =
\begin{bmatrix}
\sin^2(\theta_1) & \cos^2(\theta_1+\epsilon) & 1 \\
\sin^2(\theta_2) & \cos^2(\theta_2+\epsilon) & 1 \\
\vdots & \vdots & \vdots \\
\sin^2(\theta_m) & \cos^2(\theta_m+\epsilon) & 1 \\
\end{bmatrix}
\]
```julia
using LinearAlgebra
θ = LinRange(0, 3, 400)
ε = 1e-7
A = @. [sin(θ)^2 cos(θ+ε)^2 θ^0]
xᵉ = [1., 2., 1.]               # exact solution
b = A*xᵉ
xn = A'A \ A'b                  # compute xn via normal equations
Q, R = qr(A); Q = Matrix(Q)     # compute xr via QR (thin Q)
xr = R \ (Q'b)
xb = A \ b                      # compute xb via backslash
@show xn xr xb;
```

```
xn = [0.9124057128050078, 1.9124057125688128, 1.0875942872529518]
xr = [0.999999999633748, 1.999999999633748, 1.0000000003662528]
xb = [0.9999999998426099, 1.99999999984261, 1.0000000001573903]
```
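Because \(\epsilon\) is tiny, \(\mathrm{cond}(A)\) is on the order of \(10^7\): the QR-based solutions `xr` and `xb` (Julia's backslash also uses QR for rectangular systems) recover `xᵉ` to about nine digits, while the normal-equations solution `xn`, whose error scales with \(\mathrm{cond}(A)^2\), retains barely one.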