Homework 2

Least-squares, regularization, QR, and SVD

Homework submission instructions

P1. Quadratic fit to data

Consider the NO₂ trendline example from the class notes.

  • Compute the coefficients \(\alpha_1, \alpha_2, \alpha_3\) that fit the quadratic model \[ q(t) = \alpha_1 + \alpha_2 t + \alpha_3 t^2 \] to the detrended data (see the last plot of that example) in the least-squares sense.
  • Provide a plot of the detrended data and overlay the fitted function \(q(t)\).
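As a sketch of the mechanics only (the actual \(t\) and detrended values live in the class notes, so synthetic stand-in data is used here), the coefficients come from a single backslash solve against a three-column design matrix:

```julia
using Random, Plots
Random.seed!(0)

# Hypothetical stand-in for the detrended NO₂ data; substitute the
# (t, y) vectors from the class-notes example.
t = collect(range(0, 1, length=100))
y = 0.5 .- 2 .* t .+ 3 .* t.^2 .+ 0.05 .* randn(100)

A = [ones(length(t)) t t.^2]   # design matrix with columns 1, t, t²
α = A \ y                      # least-squares solution (backslash uses QR)

scatter(t, y, label="detrended data", ms=2)
plot!(t, A * α, w=2, label="quadratic fit q(t)")
```

For an overdetermined system, Julia's backslash returns the least-squares solution, so no normal equations need to be formed explicitly.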

P2. Iterative regularization

Infrastructure code
using LinearAlgebra, FFTW, Plots, Random

# Generate a signal using 4 random elements from the DCT basis.
function signal(n=200, k=4, ϵ=0.1)
    Random.seed!(19)
    x = zeros(n)
    p = randperm(n)[1:k]
    x[p] .= rand(k)
    noise = ϵ*randn(n)
    s = FFTW.dct(x) # true signal
    b = s + noise   # noisy signal
    return s, b
end

# Create an n-dimensional explicit Discrete Cosine Transform matrix.
function DCT(n::Integer)
    D = zeros(n, n)
    for i in 1:n
        e = zeros(n); e[i] = 1
        D[:,i] = FFTW.dct(e)
    end
    return D
end;

You observe the following noisy 1-D signal, which is stored in the vector \(b\). The true (and “unknown”) signal \(s\) is displayed in blue. (This is a contrived example — in practice, we would not know the true signal.)

s, b = signal()
plot([s b], w = [2 1], label=["true signal `s`" "observed noisy signal `b`"])

P2(a) Least-squares solution

Obtain a guess of the underlying signal \(s\) by solving a linear least-squares problem with the DCT matrix \(D\) as the basis. Plot the recovered signal \(\bar s\) on top of the true signal \(s^\natural\). How well does the least-squares solution recover the true signal?
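One possible setup, assuming the infrastructure code above has been run (the variable names are just suggestions):

```julia
# Least-squares recovery with the full DCT basis.
# Assumes `signal` and `DCT` from the infrastructure code above.
s, b = signal()
D = DCT(length(b))

x̄ = D \ b      # least-squares coefficients
s̄ = D * x̄      # recovered signal

# D is square with orthonormal columns, so the least-squares residual
# is zero — compare s̄ against both s and b when you plot.
plot([s s̄], w=[2 1], label=["true signal" "least-squares recovery"])
```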

P2(b) Iterative regularization

Now suppose that we have knowledge that the signal \(s^\natural\) (i.e., the signal without noise) is very uniform, and is in fact the superposition of a few sinusoidal-like elements, such as are contained in the columns of the Discrete Cosine Transform. Here are the first five columns of the DCT matrix:

D = DCT(length(b))
plot(D[:, 1:5], xlims=(2, length(b)), leg=false)

The first five columns of the DCT matrix

The question, then, is how to choose the right columns of \(D\) in order to obtain a good guess of the underlying signal. In other words, is there a sparse vector \(x\) (i.e., one with just a few nonzeros) such that \[ Dx \approx s? \] Equivalently, are there a few columns of \(D\) whose span approximately contains the true signal \(s\)?

We’ll try to solve this problem by building up a solution \(x\) element by element. We do this by solving a sequence of least-squares problems of increasing column dimension. The procedure begins with a trivial solution \(x_0 = 0\), associated residual \(r_0 = b\), and an empty set of columns \(S_0 = [\ ]\). At each iteration \(k \ge 1\) of the procedure, we do the following:

  1. Find the index \(j_k\) of the column of \(D\) that is most correlated with the current residual \(r_{k-1}\), i.e., \[ j_k := \arg\max_j\ |D[:,j]^T r_{k-1}|. \]
  2. Stop if the correlation is small, i.e., if \(|D[:,j_k]^T r_{k-1}| < \tau\) for some threshold \(\tau\).
  3. Append that column to the current set of columns to obtain \(S_{k} = [\, D[:,j_k] \mid S_{k-1}\,]\).
  4. Obtain the next coefficient approximation \(x_{k}\) as the solution of the least-squares problem \[ x_{k} := \arg\min_x\ \tfrac12\|S_k x - b\|^2, \] and update the residual \(r_{k} := b - S_k x_{k}\).

Implement this procedure and plot the recovered signal \(\hat s\) overlaid on the true signal \(s^\natural\). How well does the iterative regularization procedure recover the true signal? How many iterations are required to obtain a good approximation?
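One possible implementation sketch (again assuming the infrastructure code above; the threshold value is a placeholder to tune, and the loop is the standard greedy pattern, not necessarily the only valid design):

```julia
# Greedy column selection, following steps 1–4 above.
# Assumes `signal` and `DCT` from the infrastructure code above.
function recover(D, b; τ=0.5, kmax=size(D, 2))
    S = zeros(length(b), 0)           # S₀: empty set of columns
    x = Float64[]
    r = copy(b)                       # r₀ = b
    for k in 1:kmax
        c, j = findmax(abs.(D' * r))  # step 1: most-correlated column
        c < τ && break                # step 2: stop on small correlation
        S = [S D[:, j]]               # step 3: append that column
        x = S \ b                     # step 4: re-solve least squares
        r = b - S * x                 #         and update the residual
    end
    return S * x, size(S, 2)          # recovered signal, iteration count
end

s, b = signal()
D = DCT(length(b))
ŝ, iters = recover(D, b)              # τ = 0.5 is a guess; experiment
plot([s ŝ], w=[2 1], label=["true signal" "recovered signal"])
```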

P3. Underdetermined systems

Show how to use the QR factorization to solve a full-rank underdetermined system \(Ax=b\) (i.e., \(A\) is \(m\times n\) with \(m<n\) and \(\mathrm{rank}(A)=m\)). Give an explicit procedure for solving this system using the QR factors. Your procedure must use only matrix-vector products and triangular solves. (Hint: consider using the thin QR factorization of \(A^T\).)

Demonstrate that your method works on a random \(2\times 3\) system. These functions may be useful:

m, n = 2, 3     # dimensions of the 2×3 system
A = randn(m, n) # Gaussian random matrix
x = randn(n)    # and vector
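A sketch of what the demonstration might look like under the hint's factorization \(A^T = QR\), so that \(A = R^T Q^T\) and a solution follows from one triangular solve and one matrix-vector product:

```julia
using LinearAlgebra, Random
Random.seed!(0)

m, n = 2, 3
A = randn(m, n)               # Gaussian random matrix
b = randn(m)

F = qr(A')                    # thin QR of Aᵀ: Aᵀ = QR
Q = Matrix(F.Q)               # n×m, orthonormal columns
R = F.R                       # m×m, upper triangular

y = LowerTriangular(R') \ b   # triangular solve: Rᵀy = b
x = Q * y                     # matrix-vector product

@assert A * x ≈ b             # x solves the system
```

Because this \(x\) lies in the column space of \(Q\), i.e., the row space of \(A\), it is in fact the minimum-norm solution.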

P4. (Beck 3.1) Tikhonov regularization

Fix the matrices \(A\in\mathbb{R}^{m\times n}\) and \(L\in\mathbb{R}^{n\times n}\). Consider the regularized least-squares problem \[ \min_x\ \|Ax-b\|^2 + \|Lx\|^2. \] Show that this problem has a unique solution if and only if \(\mathrm{null}(A)\cap\mathrm{null}(L) = \{0\}\).
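One standard first step (a hint, not the full argument): the objective can be rewritten as an ordinary least-squares problem in a stacked matrix, \[ \|Ax-b\|^2 + \|Lx\|^2 = \left\| \begin{bmatrix} A \\ L \end{bmatrix} x - \begin{bmatrix} b \\ 0 \end{bmatrix} \right\|^2, \] so uniqueness reduces to asking when the stacked matrix has linearly independent columns.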

P5. Singular values (Reichel)

Define the matrix \(A\) with entries \[ a_{ij} = \sqrt{1-x_i^2 - y_j^2}, \quad x_i = y_i = -0.7+\Delta(i-1),\ \Delta=10^{-3} \] where \(i,j=1,2,\ldots,1401\).

  1. Plot the matrix \(A\) in 3D.
  2. Plot the first 50 singular values of \(A\).
  3. Plot the error matrix \(|A - A_k|\) (interpreted componentwise) where \(A_k\) is the best rank-\(k\) approximation to \(A\). Choose \(k\) to be the index of the largest singular value below \(10^{-10}\). You may try a surface or contour plot.
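A sketch of the construction and the three plots (the specific plot types are suggestions, not requirements):

```julia
using LinearAlgebra, Plots

Δ = 1e-3
x = -0.7 .+ Δ .* (0:1400)                # x_i = y_i, i = 1,…,1401
A = [sqrt(1 - xi^2 - yj^2) for xi in x, yj in x]

surface(x, x, A)                         # 1. the matrix in 3D

F = svd(A)
plot(F.S[1:50], yscale=:log10, m=:o, leg=false)  # 2. first 50 σᵢ

k = findfirst(<(1e-10), F.S)             # first (largest) σ below 10⁻¹⁰
Aₖ = F.U[:, 1:k] * Diagonal(F.S[1:k]) * F.Vt[1:k, :]
contour(x, x, abs.(A - Aₖ), fill=true)   # 3. componentwise error
```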