Homework 1
Review: Linear Algebra, Calculus, and Limits
- See the syllabus for the due date
- Follow the homework submission instructions
- There are no late days allowed for this assignment. You must turn in this assignment by the deadline to receive credit.
- You may not collaborate with other students on this assignment.
- Otherwise, follow the standard submission instructions.
Linear Algebra
Use these vectors and matrices to answer the questions below. Some of the questions are meant to be answered by writing Julia code; in that case, show the commands you use. Others require a short written answer.
P1. Compute the rank of \(A\).
P2. Solve the equation \(Ax=b\) using A\b.
P3. Solve the equation \(Ax=b\) by factorizing \(A\) as \(A=LU\), then solving \(LUx=b\). Compute the factorization using lu(A).
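Here is a minimal Julia sketch of the two solution approaches in P2 and P3, using a small illustrative system (an assumption for this sketch; substitute the \(A\) and \(b\) given above).

```julia
using LinearAlgebra

# Small illustrative system; replace with the A and b given in the assignment.
A = [4.0 1.0; 1.0 3.0]
b = [1.0, 2.0]

# P2 approach: the backslash operator solves Ax = b directly.
x_backslash = A \ b

# P3 approach: factor A with partial pivoting, P*A = L*U, then solve the
# two triangular systems L*y = P*b and U*x = y.
F = lu(A)
y = F.L \ b[F.p]
x_lu = F.U \ y

x_backslash ≈ x_lu    # the two solutions should agree
```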
P4. Determine the set of points \(\mathcal{L}\) that contains all scalings (positive and negative) of the vector \(b\).
P5. Compute the projection of the vector \(s\) onto the set \(\mathcal{L}\).
Calculus
P6. (Derivatives) Compute the first and second derivatives of these functions. Unless specified otherwise, \(\log\) denotes the natural logarithm.
- \(f(x) = x^2 + 2x + 1\) (and report the minimizer)
- \(f(x) = \sin(x) + \cos(x)\)
- \(f(x) = \exp(x) + \log(x)\)
- \(f(x) = \sum_{i=1}^m \log(1+\exp(-a_i x))\)
P7. (Gradients) Compute the gradient function \(\nabla f(x)\) for these functions. Here \(a\) is an \(n\)-vector, \(H\) is an \(n\)-by-\(n\) symmetric matrix, and \(A\) is an \(m\)-by-\(n\) matrix (square or rectangular).
- \(f(x) = \langle a, x\rangle + \beta\)
- \(f(x) = \langle x, Hx\rangle \equiv x^T H x\)
- \(f(x) = \tfrac12\|Ax-b\|^2\)
- \(f(x) = \log\sum_{i=1}^m \exp(\langle a_i, x\rangle)\)
The inner product between two vectors \(x\) and \(y\) of the same length is denoted \(\langle x, y\rangle=x^T y = \sum_i x_i y_i\); it is a scalar. It is also called the dot product because it is often written \(x\cdot y\).
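As a quick check, here is a small Julia snippet (using the LinearAlgebra standard library) showing three equivalent ways to compute the inner product:

```julia
using LinearAlgebra

x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

dot(x, y)      # 32.0
x' * y         # same scalar, the x^T y form
sum(x .* y)    # same scalar, the sum_i x_i y_i form
```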
In the quadratic form \(f(x) = \langle x, Hx\rangle\), we may assume that the matrix \(H\) is symmetric, i.e., \(H^T = H\): otherwise we could replace \(H\) with \(\tfrac12(H+H^T)\), since \(\langle x, Hx\rangle = \langle x, \tfrac12(H+H^T)x\rangle\).
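This identity holds because a scalar equals its own transpose:
\[
x^T H x = (x^T H x)^T = x^T H^T x,
\qquad\text{so}\qquad
x^T H x = \tfrac12\bigl(x^T H x + x^T H^T x\bigr) = x^T\,\tfrac12(H+H^T)\,x.
\]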
P8. Let \(f(x)=\sum_{j=1}^{2}\sin(\cos(x_j))\). Use automatic differentiation (AD) to compute the gradient \(\nabla f(x)\) at the point \(x=(0,\pi/2)\). Use the ReverseDiff.jl package. You can install it with Pkg.add("ReverseDiff"). You can read the documentation.
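Here is a minimal sketch of the ReverseDiff.gradient call pattern, shown on an illustrative function \(g\) (an assumption for this example, not the \(f\) from P8):

```julia
using ReverseDiff   # install with Pkg.add("ReverseDiff")

# Illustrative objective for the sketch: g(x) = sum_j x_j^2.
g(x) = sum(abs2, x)

x0 = [1.0, 2.0]
ReverseDiff.gradient(g, x0)   # reverse-mode AD gradient of g at x0, here [2.0, 4.0]
```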
Sequences and rates of convergence
The rate of convergence tells us how fast a sequence converges to its limit (if one exists).
A sequence \(\{x_k\}\) converging to \(x^*\) is said to converge linearly with rate \(\mu\) if \[\lim_{k\to\infty} \frac{|x_{k+1}-x^*|}{|x_k-x^*|}=\mu\in(0,1).\] The sequence is said to converge superlinearly if \[\lim_{k\to\infty} \frac{|x_{k+1}-x^*|}{|x_k-x^*|}=0.\] The sequence is said to converge quadratically with rate \(\mu\) if \[\lim_{k\to\infty} \frac{|x_{k+1}-x^*|}{|x_k-x^*|^2}=\mu\in(0,\infty).\] Note that a sequence that converges quadratically also converges superlinearly, and a sequence that converges superlinearly also converges linearly. The converse implications do not hold.
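As a quick illustration of the definition (using a sequence that is not part of P9): \(x_k = 1/k\) converges to \(x^* = 0\), yet
\[
\lim_{k\to\infty}\frac{|x_{k+1}-x^*|}{|x_k-x^*|}
= \lim_{k\to\infty}\frac{1/(k+1)}{1/k}
= \lim_{k\to\infty}\frac{k}{k+1} = 1,
\]
so it does not converge linearly with any rate \(\mu\in(0,1)\); such convergence is called sublinear.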
For much more on the speed of sequence convergence, see How Many Steps Still Left to \(x^*\) (Cătinaş 2021) and Wikipedia.
P9. Derive the convergence rate and order for each of the sequences \(\{1/2^k\}\), \(\{2^{-2^k}\}\), and \(\{3^{-k^2}\}\).
P10. Provide two separate plots of the values of the first 50 elements of these sequences: the first plots the sequence value vs \(k\); the second plots the log of the sequence value vs \(k\). The latter plot is called a log-linear, or semi-log, plot.
Use the Plots.jl package. You can install it with Pkg.add("Plots"). This tutorial may be helpful.
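Here is a minimal Plots.jl sketch of the two requested plot types, using a placeholder sequence (an assumption; substitute the sequences from P9):

```julia
using Plots   # install with Pkg.add("Plots")

k = 1:50
xk = 1 ./ k   # placeholder sequence; replace with the sequences from P9

plot(k, xk, xlabel = "k", ylabel = "x_k", label = "")                  # sequence value vs k
plot(k, xk, xlabel = "k", ylabel = "x_k", label = "", yscale = :log10) # semi-log (log-linear) plot
```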