Convex Functions

CPSC 406 – Computational Optimization

Convex functions

  • definition
  • examples
  • restriction to lines
  • operations that preserve convexity
  • first and second-order characterization

Convex functions

  • \(f:\mathcal{C}\to\R\) is convex if \(\mathcal{C}\subset\Rn\) is convex and for all \(x,y\in\mathcal{C}\) and \(\theta\in[0,1]\),

\[ f(\theta x + (1-\theta)y) \le \theta f(x) + (1-\theta)f(y) \]

  • \(f\) is strictly convex if the inequality is strict for \(x\ne y\) and \(\theta\in(0,1)\)

\[ f(\theta x + (1-\theta)y) < \theta f(x) + (1-\theta)f(y) \]

  • \(f\) is concave if \((-f)\) is convex
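The defining inequality can be spot-checked numerically. The sketch below is an illustration only, not a proof: the helper name `violates_convexity`, the sampling interval, and the tolerance are all assumptions made for this example. It draws random pairs \(x,y\) and weights \(\theta\) and reports whether any sample violates the inequality.

```python
import math
import random

def violates_convexity(f, lo, hi, trials=10_000, tol=1e-9, seed=0):
    """Return True if some sampled (x, y, theta) violates the convexity inequality.

    Hypothetical helper for illustration: finding no violation does not
    prove convexity, but finding one disproves convexity on [lo, hi].
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(lo, hi)
        y = rng.uniform(lo, hi)
        theta = rng.random()
        lhs = f(theta * x + (1 - theta) * y)      # f at the convex combination
        rhs = theta * f(x) + (1 - theta) * f(y)   # convex combination of values
        if lhs > rhs + tol:
            return True
    return False

print(violates_convexity(math.exp, -3.0, 3.0))     # exp is convex on R
print(violates_convexity(math.sin, 0.0, math.pi))  # sin is concave on [0, pi]
```

A function that passes this check on samples may still fail to be convex elsewhere, so the check is best used to catch mistakes rather than to certify convexity.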

Examples

Convex functions

  • exponential: \(e^{ax}\) for any \(a\in\R\)
  • powers: \(x^\alpha\) over \(x\ge0\) for any \(\alpha\ge1\), and over \(x>0\) for any \(\alpha\le0\)
  • absolute value: \(|x|^\alpha\) for any \(\alpha\ge1\)
  • norms: \(\|x\|_p\) for any \(p\ge1\)


Concave functions

  • powers: \(x^\alpha\) over \(x\ge0\) for any \(0<\alpha<1\)
  • logarithm: \(\log x\) over \(x>0\)


Convex and concave

  • affine: \(a^Tx + \beta\) for any \(a\in\Rn\) and \(\beta\in\R\)

Restriction to lines

\(f:\Rn\to\R\) is convex if and only if \[ \phi(\alpha) = f(x+\alpha d) \] is convex over \(\alpha\in\R\) for all points \(x\) and directions \(d\)

 

Example. Quadratic functions \[ f(x)=\half x^TAx+b^Tx+\gamma \] with \(A\) symmetric are convex if and only if \(A\succeq 0\)
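A sketch of why the quadratic example follows from the restriction-to-lines test, assuming \(A\) is symmetric (so that \(x^TAd=d^TAx\)). Expanding along the line \(x+\alpha d\) gives

\[ \phi(\alpha) = \half(x+\alpha d)^TA(x+\alpha d) + b^T(x+\alpha d) + \gamma = \half\alpha^2\, d^TAd + \alpha\,(Ax+b)^Td + f(x) \]

This is a univariate quadratic in \(\alpha\), which is convex exactly when its leading coefficient is nonnegative, i.e., \(d^TAd\ge0\). Requiring this for every direction \(d\) is the statement \(A\succeq0\).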

Operations that preserve convexity

  • nonnegative scaling

\[ \alpha f \quad\text{is convex if}\quad f\quad\text{is convex and}\quad \alpha\ge 0 \]

  • sum (including infinite sums)

\[ f_1+f_2 \quad\text{is convex if}\quad f_1,f_2\quad\text{are convex} \]

  • composition with affine function

\[ f(Ax+b) \quad\text{is convex if}\quad f\quad\text{is convex} \]

Examples

  • \(f(x) = \|Ax-b\|\)
  • \(f(x) = -\sum_{i=1}^m\log(b_i-a_i^Tx)\) over \(\set{x\mid a_i^Tx<b_i}\)
  • \(f(x_1, x_2, x_3) = \exp(x_1-x_2+x_3)+\exp(2x_2) + x_1\)
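One way to read these examples through the rules above, assuming \(\|\cdot\|\) is any norm. The symbols \(g\), \(h\), \(u\), \(v\), \(w\) are names introduced here for illustration only:

\[ \begin{aligned} \|Ax-b\| &= g(Ax-b), && g(z)=\|z\| \\ -\sum_{i=1}^m\log(b_i-a_i^Tx) &= \sum_{i=1}^m h(b_i-a_i^Tx), && h(t)=-\log t \ \text{over}\ t>0 \\ \exp(x_1-x_2+x_3)+\exp(2x_2)+x_1 &= \exp(u^Tx)+\exp(v^Tx)+w^Tx, && u=(1,-1,1),\ v=(0,2,0),\ w=(1,0,0) \end{aligned} \]

In each case the outer functions (\(g\), \(h\), \(\exp\), and the affine term) are convex, each is composed with an affine map of \(x\), and the results are summed, so all three functions are convex.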

Question

When is \(f(x) = x^\alpha\) convex over \(x\ge0\)?

  1. \(\alpha\ge 0\)
  2. \(\alpha \le 0\) or \(\alpha\ge1\)
  3. \(\alpha\ge1\)
  4. \(\alpha \ge 0\) or \(\alpha\le1\)

Question

Prove that the log-sum-exp function

\[ f(x) = \log\left(\sum_{i=1}^m e^{a_i^Tx}\right) \]

is convex over \(\Rn\).

Convex optimization

\[ \min_{x\in\mathcal{C}}\ f(x), \quad \text{$\mathcal{C}\subset\Rn$ convex},\quad\text{$f:\mathcal{C}\to\R$ convex} \]

If \(x^*\) is a local minimizer, then it is also a global minimizer, i.e., for some radius \(\epsilon>0\),

\[ f(x^*)\le f(x)\ \forall x\in\mathcal{C}\cap\epsilon\mathbb{B}(x^*) \quad\Longrightarrow\quad f(x^*)\le f(x)\ \forall x\in\mathcal{C} \]

Proof

Suppose \(\bar x\) is a local but not global minimizer. Then,

  • there exists \(y\in\mathcal{C}\) such that \(f(y)<f(\bar x)\)
  • examine the line segment between \(\bar x\) and \(y\), i.e., for \(\theta\in[0,1)\),

\[ \begin{aligned} f(\theta \bar x + (1-\theta)y) &\le \theta f(\bar x) + (1-\theta)f(y) & \text{(convexity)} \\ &< \theta f(\bar x) + (1-\theta)f(\bar x) & \text{(since $f(y)<f(\bar x)$)} \\ &= f(\bar x) \end{aligned} \]

  • as \(\theta\to1\), the point \(\theta\bar x+(1-\theta)y\) lies in \(\mathcal{C}\) arbitrarily close to \(\bar x\) yet has a strictly smaller value, which contradicts the local optimality of \(\bar x\)

Level sets

The level set of \(f:\Rn\to\R\) at level \(\alpha\in\R\) is

\[ [f\le\alpha] := \set{x\in\Rn\mid f(x)\le\alpha} \]

  • \(f\) is convex \(\quad\Longrightarrow\quad\) all level sets of \(f\) are convex

Proof

  • take \(x,y\in[f\le\alpha]\), then \(f(x)\le\alpha\) and \(f(y)\le\alpha\)
  • because \(f\) is convex, for all \(\theta\in[0,1]\)

\[ f(\theta x + (1-\theta)y) \le \theta f(x) + (1-\theta)f(y) \le \theta\alpha + (1-\theta)\alpha = \alpha \]

  • thus, \(\theta x + (1-\theta)y\in[f\le\alpha]\)

Corollary

  • the set of minimizers of \(f\) over \(\mathcal{C}\) is convex

First-order characterization

  • Let \(f:\mathcal{C}\to\R\) be differentiable over the convex set \(\mathcal{C}\subset\Rn\). Then \(f\) is convex if and only if

\[ f(y) \ge f(x) + \nabla f(x)^T(y-x) \quad \forall x,y\in\mathcal{C} \]

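  • for convex \(f\), this implies \(x^*\) is a global minimizer if \(\nabla f(x^*)=0\): setting \(x=x^*\) gives \(f(y)\ge f(x^*)\) for all \(y\in\mathcal{C}\) (i.e., stationarity is sufficient for global optimality)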
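The gradient inequality can be spot-checked on the three-variable example \(f(x_1,x_2,x_3)=\exp(x_1-x_2+x_3)+\exp(2x_2)+x_1\) from earlier in these notes. The sketch below is a sanity check under stated assumptions, not a proof: the gradient is derived by hand, and the helper name `linearization_gap`, the sampling box \([-1,1]^3\), and the tolerance are choices made for this example.

```python
import math
import random

def f(x):
    x1, x2, x3 = x
    return math.exp(x1 - x2 + x3) + math.exp(2 * x2) + x1

def grad_f(x):
    # Hand-derived gradient of f (an assumption of this sketch).
    x1, x2, x3 = x
    e = math.exp(x1 - x2 + x3)
    return [e + 1.0, -e + 2.0 * math.exp(2 * x2), e]

def linearization_gap(x, y):
    """f(y) - [f(x) + grad_f(x)^T (y - x)]; nonnegative when f is convex."""
    g = grad_f(x)
    inner = sum(gi * (yi - xi) for gi, xi, yi in zip(g, x, y))
    return f(y) - (f(x) + inner)

rng = random.Random(0)
samples = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2000)]
# Pair consecutive samples as (x, y) and record the smallest gap seen.
min_gap = min(linearization_gap(x, y) for x, y in zip(samples[::2], samples[1::2]))
print(min_gap >= -1e-9)
```

Geometrically, the gap measures how far the graph of \(f\) sits above its tangent plane at \(x\); for a convex function it never drops below zero.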

 

Second-order characterization

  • Let \(f:\mathcal{C}\to\R\) be twice differentiable over the open convex set \(\mathcal{C}\subset\Rn\). Then \(f\) is convex if and only if

\[ \nabla^2f(x)\succeq 0 \quad \forall x\in\mathcal{C} \]
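As a brief worked illustration in one dimension, where the Hessian reduces to the second derivative, two of the earlier examples can be checked directly:

\[ \frac{d^2}{dx^2}\,e^{ax} = a^2e^{ax} \ge 0 \quad \forall x\in\R, \qquad \frac{d^2}{dx^2}\,(-\log x) = \frac{1}{x^2} > 0 \quad \forall x>0 \]

So \(e^{ax}\) is convex for any \(a\in\R\), and \(-\log x\) is convex over \(x>0\), which is the same as saying \(\log x\) is concave there.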

\[ f(x) = x^\alpha \quad\text{over}\quad x\in\R_+ \] Over what values of \(\alpha\) is \(f\)

  1. convex?
  2. concave?