This example uses the following packages:
using LinearAlgebra
using Plots
The standard least-squares problem can be interpreted as a method for recovering some underlying object (say, a signal or image) encoded in the vector $x_0$ using noisy measurements of the form
$$b = Ax_0 + w,$$
where $w$ is random noise and the matrix $A$ describes the measurement process. The least-squares problem
$$\min_x\ \|Ax - b\|^2 \tag{1}$$
seeks a solution $x$ under the implicit assumption that the noise term $w$ is small.
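To make this concrete, here is a minimal sketch of recovering a signal from noisy linear measurements; the sizes and the random matrix A are hypothetical, not part of the original example:
m, n = 200, 50
A = randn(m, n)       # measurement matrix
x0 = randn(n)         # underlying signal to recover
w = 0.1*randn(m)      # random noise
b = A*x0 + w          # noisy measurements
xLS = A \ b           # least-squares estimate of x0
norm(xLS - x0)        # small relative to norm(x0) when the noise is small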
But what if $x$ also needs to satisfy some other competing objective? Suppose, for example, that we also have available another set of measurements of $x_0$ of the form $d = Cx_0 + v$, where $v$ is also random noise. The "best" choice for $x$ is not necessarily the solution of (1), and instead we wish to choose $x$ to balance the two objective values
$$f_1(x) = \|Ax - b\|^2 \quad\text{and}\quad f_2(x) = \|Cx - d\|^2.$$
Generally, we can make $f_1(x)$ or $f_2(x)$ small, but not both. The figure below sketches the relationship between the pair $\big(f_1(x), f_2(x)\big)$ for all values of $x$.
The objective pairs on the boundary of the two regions form the Pareto frontier. We can compute these Pareto-optimal solutions by minimizing the weighted-sum objective
$$f_1(x) + \lambda f_2(x), \tag{2}$$
where the positive parameter $\lambda$ provides the relative weight between the objectives.
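Here is a minimal sketch of tracing out the Pareto frontier by sweeping $\lambda$; the second measurement pair (C, d) and the grid of $\lambda$ values are hypothetical stand-ins. Each weighted problem is solved as a single stacked least-squares problem:
m, n, p = 200, 50, 40
A, C = randn(m, n), randn(p, n)
x0 = randn(n)
b = A*x0 + 0.1*randn(m)
d = C*x0 + 0.1*randn(p)
frontier = map([0.1, 1.0, 10.0]) do λ
    x = [A; √λ*C] \ [b; √λ*d]          # minimizer of f₁(x) + λ·f₂(x)
    (norm(A*x - b)^2, norm(C*x - d)^2)  # one point on the tradeoff curve
end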
For fixed scalars $\alpha$ and $\lambda$, the set
$$\{\, (f_1, f_2) \mid f_1 + \lambda f_2 = \alpha \,\}$$
forms the graph of a line with slope $-1/\lambda$ (rearranging, $f_2 = (\alpha - f_1)/\lambda$). We may visualize the optimization problem (2) as the problem of finding the smallest value $\alpha$ such that the corresponding line is tangent to the Pareto frontier.
A particularly common example of regularized least-squares is Tikhonov regularization, which has the form
$$\min_x\ \|Ax - b\|^2 + \lambda \|x\|^2,$$
for which the objective can be expressed as
$$\left\|\, \begin{bmatrix} A \\ \sqrt{\lambda}\, I \end{bmatrix} x - \begin{bmatrix} b \\ 0 \end{bmatrix} \right\|^2.$$
Even if $A$ doesn't have full column rank, the stacked matrix
$$\begin{bmatrix} A \\ \sqrt{\lambda}\, I \end{bmatrix}$$
necessarily has full column rank for any positive $\lambda$, which implies that the regularized problem always has a well-defined unique solution.
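As a brief illustration (a minimal sketch; the sizes and the value of λ are hypothetical), the Tikhonov solution can be computed with the stacked formulation, and the solve succeeds for any positive λ even when A is rank-deficient:
m, n, λ = 100, 50, 0.5
A, b = randn(m, n), randn(m)
xTik = [A; √λ*I] \ [b; zeros(n)]   # unique solution for any λ > 0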
Consider a noisy measurement
$$b = x + w$$
of a signal $x$, where the vector $w$ represents unknown noise. Here's a simple 1-dimensional noisy signal:
n = 300
t = LinRange(0, 4, n)          # sample locations
x = @. sin(t) + t*cos(t)^2     # true underlying signal
w = 0.1*randn(n)               # additive Gaussian noise
b = x + w                      # noisy measurement
plot(b, leg=:topleft, label="noisy signal")
The obvious least-squares problem
$$\min_x\ \|x - b\|^2$$
isn't useful because the optimal solution is simply the noisy measurement $x = b$, and so it doesn't yield any new information. But suppose we believe that the signal is "smooth" in the sense that the consecutive elements of $x = (x_1, x_2, \ldots, x_n)$
change relatively little, i.e., the difference $x_{i+1} - x_i$ is small relative to $\|x\|$. In this case, we might balance the least-squares fit against the smoothness of the solution by instead solving the regularized least-squares problem
$$\min_x\ \|x - b\|^2 + \lambda \sum_{i=1}^{n-1} (x_{i+1} - x_i)^2. \tag{9}$$
The role of the regularizer $\sum_{i=1}^{n-1} (x_{i+1} - x_i)^2$ is to promote smooth changes in the elements of $x$.
We can alternatively write the above minimization program in matrix notation. Define the $(n-1)$-by-$n$ finite-difference matrix
$$D = \begin{bmatrix} 1 & -1 & & & \\ & 1 & -1 & & \\ & & \ddots & \ddots & \\ & & & 1 & -1 \end{bmatrix},$$
which, when applied to a vector $x$, yields a vector of the differences:
$$Dx = \begin{bmatrix} x_1 - x_2 \\ x_2 - x_3 \\ \vdots \\ x_{n-1} - x_n \end{bmatrix}.$$
Then we can rephrase the regularization objective as
$$\sum_{i=1}^{n-1} (x_{i+1} - x_i)^2 = \|Dx\|^2.$$
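As a quick numerical sanity check (a small illustrative sketch; it builds the difference matrix the same way as the finiteDiff helper defined below), the sum of squared differences agrees with $\|Dx\|^2$:
m = 5
Dm = diagm(0 => ones(m), +1 => -ones(m-1))[1:m-1, :]
s = randn(m)
sum((s[i+1] - s[i])^2 for i in 1:m-1) ≈ norm(Dm*s)^2   # true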
This allows for a reformulation of the weighted least-squares objective into a familiar least-squares objective:
$$\|x - b\|^2 + \lambda \|Dx\|^2 = \|\hat{A}x - \hat{b}\|^2, \qquad \hat{A} := \begin{bmatrix} I \\ \sqrt{\lambda}\, D \end{bmatrix}, \quad \hat{b} := \begin{bmatrix} b \\ 0 \end{bmatrix}.$$
So the solution to the weighted least-squares minimization program (9) satisfies the normal equation $\hat{A}^T\!\hat{A}\,x = \hat{A}^T\hat{b}$, which simplifies to
$$(I + \lambda D^T\! D)\, x = b.$$
The following function generates the required finite-difference matrix:
# (n-1)×n matrix: 1 on the diagonal, -1 on the superdiagonal
finiteDiff(n) = diagm(0 => ones(n), +1 => -ones(n-1))[1:n-1,:]
Here's an example of a small finite-difference matrix:
finiteDiff(4)
3×4 Matrix{Float64}:
 1.0  -1.0   0.0   0.0
 0.0   1.0  -1.0   0.0
 0.0   0.0   1.0  -1.0
The function below returns a generator that produces logarithmically spaced numbers in the interval $[x_1, x_2]$:
LogRange(x1, x2, n) = (10^y for y in range(log10(x1), log10(x2), length=n))
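For example (a quick illustrative call):
collect(LogRange(1e-2, 1e2, 5))   # ≈ [0.01, 0.1, 1.0, 10.0, 100.0]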
Now solve the regularized least-squares problem for several values of $\lambda$:
D = finiteDiff(n)
b̂ = [b; zeros(n-1)]       # stacked right-hand side
plot(t, b, w=1, leg=:topleft, label="noisy data")
for λ in LogRange(1e0, 1e4, 3)
    Â = [ I; √λ*D ]        # stacked coefficient matrix
    xLS = Â \ b̂
    plot!(t, xLS, w=2, label="regularized solution: λ = $(λ)")
end
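As a cross-check (a brief sketch with one hypothetical λ value, reusing D, b̂, and n from above), the same solution can be computed directly from the normal equation $(I + \lambda D^T D)x = b$ derived earlier:
λ = 1e2
xNE = (I + λ*(D'*D)) \ b           # normal-equation solve
xQR = [I; √λ*D] \ b̂                # stacked least-squares solve
xNE ≈ xQR                          # true, up to numerical round-off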
Note that the expression Â = [ I; √λ*D ] contains the UniformScaling object I, which represents an identity matrix of any size.
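As a small illustration (not part of the original example) of how I adapts its size during concatenation:
[I; ones(2, 3)]   # I becomes the 3×3 identity, so the result is a 5×3 matrix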