UBC CPSC 406 - Convex Optimality

Convex Optimality

optimality for convex problems
normal cone
Lagrange multipliers for linearly constrained problems

Optimality

\[ \min_x \set{f(x) \mid x \in C} \]

$f: \Rn \to \Re$ is convex differentiable
$C \subseteq \Rn$ is convex
$x^*$ is optimal if $f$ is non-decreasing along every feasible direction
if $C=\Rn$ the problem is unconstrained

\[ x^* \in \argmin_{x\in\Rn} f(x) \iff 0\le f'(x^*, d) = \nabla f(x^*)^Td \quad\text{for all}\quad x^*+d\in\Rn \]

implies $\nabla f(x^*) = 0$
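One way to see this, as a short sketch: every direction is feasible when $C=\Rn$, so the condition above can be applied to the particular choice $d = -\nabla f(x^*)$, which gives

\[ 0 \le \nabla f(x^*)^T d = -\|\nabla f(x^*)\|_2^2 \quad\implies\quad \nabla f(x^*) = 0 \]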

Optimality – constrained

\[ x^* \in \argmin_{x\in C} f(x) \iff 0\le f'(x^*, x-x^*) = \nabla f(x^*)^T(x-x^*) \quad \forall x\in C \]

does not imply $\nabla f(x^*) = 0$

Normal cone

The normal cone to the set $C\subset\Rn$ at the point $x\in C$ is the set

\[ \mathcal{N}_C(x) = \set{d\in\Rn \mid d^T(z-x) \leq 0 \quad \forall z\in C} \]

$\mathcal{N}_C(x_1)$ is the ray of normals $d$ to the supporting halfspace $H_1 = \set{z\in\Rn\mid d^T z\le d^T x_1}$
$\mathcal{N}_C(x_2) = \set{0}$ because $x_2$ is an interior point
$\mathcal{N}_C(x_3)$ is the cone of normals at the vertex $x_3$

Example

\[ \min_{x\in\R_+^2} \half(x_1-1)^2 + \half(x_2+1)^2 \]

Solution and gradient:

\[ x^* = \begin{bmatrix}1\\0\end{bmatrix} \qquad \nabla f(x^*) = \begin{bmatrix}x^*_1-1\\x^*_2+1\end{bmatrix} = \begin{bmatrix}0\\1\end{bmatrix} \]

Normal cone at $x^* = (1, 0)$:

\[ \mathcal{N}_{\R_+^2}(x^*) = \Set{\lambda\begin{bmatrix}\phantom-0\\-1\end{bmatrix} : \lambda\ge0} \]

Optimality:

\[ -\nabla f(x^*) \in \mathcal{N}_{\R_+^2}(x^*) \]
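The example can be checked numerically. The sketch below assumes NumPy and uses a hypothetical random sample of feasible points (the names `grad_f`, `samples`, `vi_holds`, and `in_normal_cone` are illustrative, not from the slides). It tests the variational inequality and the normal cone inclusion on the sample only, so it is a sanity check rather than a proof.

```python
import numpy as np

def grad_f(x):
    # Gradient of f(x) = 0.5*(x1 - 1)^2 + 0.5*(x2 + 1)^2
    return np.array([x[0] - 1.0, x[1] + 1.0])

x_star = np.array([1.0, 0.0])
g = grad_f(x_star)  # gradient at the claimed solution

# Hypothetical sample of feasible points from the nonnegative orthant.
rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 5.0, size=(1000, 2))

# Variational inequality: grad f(x*)^T (x - x*) >= 0 for each sampled x.
vi_holds = bool(np.all((samples - x_star) @ g >= 0.0))

# Normal cone inclusion: (-g)^T (z - x*) <= 0 for each sampled z.
in_normal_cone = bool(np.all((samples - x_star) @ (-g) <= 0.0))

print(g, vi_holds, in_normal_cone)
```

Here the inner product reduces to $x_2 \ge 0$, which holds for every point of the orthant.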

Necessary and sufficient optimality

a point $x^*$ is optimal, ie, $x^*\in\argmin_{x\in C} f(x)$, if and only if

\[ \nabla f(x^*)^T(x-x^*) \geq 0 \quad \forall x\in C \]

Use the definition of the normal cone to deduce the equivalent condition

\[ -\nabla f(x^*) \in \mathcal{N}_C(x^*) \]
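A sketch of the deduction: multiply the inequality by $-1$ and compare it with the definition of $\mathcal{N}_C(x^*)$, taking $d=-\nabla f(x^*)$ and $z=x$.

\[ \begin{aligned} \nabla f(x^*)^T(x-x^*) \geq 0 \quad \forall x\in C &\iff \big(-\nabla f(x^*)\big)^T(z-x^*) \leq 0 \quad \forall z\in C \\ &\iff -\nabla f(x^*) \in \mathcal{N}_C(x^*) \end{aligned} \]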

Interior point

a point $x$ is in the interior of $C$ (ie, $x\in\mathop{\rm int} C$) if all directions are feasible, ie,

\[ x + \epsilon d\in C \quad \text{$\forall d\in\Rn$ and $\epsilon>0$ small} \]

if $g\in\mathcal{N}_C(x)$ and $x\in\mathop{\rm int} C$ then for every direction $d$,

\[ \begin{aligned} 0 \ge g^T(z-x) &= \phantom+\epsilon g^T d & \text{for all}\quad z=x+\epsilon d\in C \\0 \ge g^T(z-x) &= -\epsilon g^T d & \text{for all}\quad z=x-\epsilon d\in C \end{aligned} \]

together, these imply $g^Td=0$ for every $d$, so $g=0$, and thus

\[ x\in\mathop{\rm int} C \implies \mathcal{N}_C(x) = \set{0} \]

[aside: the converse also holds for convex $C$, but proving it requires the supporting hyperplane theorem.]

unconstrained optimality:

\[ x^*\in\argmin_{x\in\Rn} f(x) \quad \iff\quad -\nabla f(x^*)\in\mathcal{N}_{\Rn}(x^*)=\set{0} \quad\iff\quad \nabla f(x^*) = 0 \]

Normal cone to an affine set

\[ C = \set{x\in\Rn \mid Ax=b}, \quad A\in\R^{m\times n}, \quad b\in\R^m \]

For any $x\in C$, define the translated set

\[ C_x = \set{z-x\mid z\in C} = \Null(A) \]

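Then,

\[ \begin{aligned} \mathcal{N}_C(x) &= \set{g\mid g^T(z-x) \leq 0 \quad \forall z\in C}\\[10pt] &= \set{g\mid g^Td \leq 0 \quad \forall d\in C_x}\\[10pt] &= \set{g\mid g^Td \leq 0 \quad \forall d\in \Null(A)}\\[10pt] &= \set{g\mid g^Td = 0 \quad \forall d\in \Null(A)}\\[10pt] &= \range(A^T) \end{aligned} \]

The fourth equality holds because $\Null(A)$ is a subspace: if $d\in\Null(A)$ then $-d\in\Null(A)$, so $g^Td\le0$ and $-g^Td\le0$ together force $g^Td=0$. The last equality is the identity $\Null(A)^\perp = \range(A^T)$.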
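The identity $\mathcal{N}_C(x)=\range(A^T)$ can be sanity-checked numerically. The sketch below assumes NumPy and a hypothetical random matrix `A`; the names `N`, `g`, and `g_bad` are illustrative. It builds an orthonormal basis of $\Null(A)$ from the SVD and checks that a vector $A^Ty$ is orthogonal to it, while a vector with a null-space component violates the normal cone inequality.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 5
A = rng.standard_normal((m, n))  # hypothetical data, full row rank almost surely

# Orthonormal basis of Null(A): right singular vectors beyond rank(A).
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
N = Vt[rank:].T  # columns span Null(A), shape (n, n - rank)

# Any g = A^T y is orthogonal to every null-space direction.
y = rng.standard_normal(m)
g = A.T @ y
max_inner = float(np.max(np.abs(N.T @ g)))

# Adding a null-space component breaks the inequality g^T d <= 0
# for the feasible direction d = N[:, 0].
g_bad = g + N[:, 0]
violation = float(g_bad @ N[:, 0])

print(max_inner, violation)
```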

Application: Linearly constrained optimization

\[ \min_{x\in\Rn} \set{f(x) \mid Ax=b} \]

a point $x^*\in C=\set{x\mid Ax=b}$ is optimal if and only if

\[ - \nabla f(x^*) \in \mathcal{N}_C(x^*) = \range(A^T) \]

or, equivalently (the sign is absorbed into $y$ because $\range(A^T)$ is a subspace),

\[ \nabla f(x^*) = A^T y \quad \text{for some $y\in\R^m$} \]

the vector $y=(y_1,\ldots,y_m)$ contains the Lagrange multipliers, one for each constraint $a_i^T x = b_i$, where $a_i^T$ is the $i$th row of $A$
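As an illustration, the condition can be solved in closed form for a simple objective. The sketch below assumes NumPy, the hypothetical objective $f(x)=\half\|x-c\|_2^2$, and random data `A`, `b`, `c` with $A$ of full row rank; none of these choices come from the slides. Substituting $\nabla f(x) = x-c = A^Ty$ into $Ax=b$ gives $(AA^T)y = b - Ac$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 2, 4
A = rng.standard_normal((m, n))  # hypothetical data, full row rank almost surely
b = rng.standard_normal(m)
c = rng.standard_normal(n)

def f(x):
    # Assumed objective: f(x) = 0.5*||x - c||^2, with gradient x - c
    return 0.5 * np.sum((x - c) ** 2)

# Optimality: x - c = A^T y and A x = b  =>  (A A^T) y = b - A c
y = np.linalg.solve(A @ A.T, b - A @ c)  # Lagrange multipliers
x_star = c + A.T @ y

grad = x_star - c
feas_residual = float(np.linalg.norm(A @ x_star - b))

# Move along a feasible direction d in Null(A); the objective cannot decrease.
w = rng.standard_normal(n)
d = w - np.linalg.pinv(A) @ (A @ w)  # projection of w onto Null(A)
gap = float(f(x_star + d) - f(x_star))

print(feas_residual, float(grad @ d), gap)
```

Because the gradient lies in $\range(A^T)$, it is orthogonal to every feasible direction $d\in\Null(A)$, and the objective gap equals $\half\|d\|_2^2 \ge 0$.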