CPSC 406 – Computational Optimization
\[ \def\argmin{\operatorname*{argmin}} \def\Ball{\mathbf{B}} \def\bmat#1{\begin{bmatrix}#1\end{bmatrix}} \def\Diag{\mathbf{Diag}} \def\half{\tfrac12} \def\int{\mathop{\rm int}} \def\ip#1{\langle #1 \rangle} \def\maxim{\mathop{\hbox{\rm maximize}}} \def\maximize#1{\displaystyle\maxim_{#1}} \def\minim{\mathop{\hbox{\rm minimize}}} \def\minimize#1{\displaystyle\minim_{#1}} \def\norm#1{\|#1\|} \def\Null{{\mathbf{null}}} \def\proj{\mathbf{proj}} \def\R{\mathbb R} \def\Re{\mathbb R} \def\Rn{\R^n} \def\rank{\mathbf{rank}} \def\range{{\mathbf{range}}} \def\sign{{\mathbf{sign}}} \def\span{{\mathbf{span}}} \def\st{\hbox{\rm subject to}} \def\T{^\intercal} \def\textt#1{\quad\text{#1}\quad} \def\trace{\mathbf{trace}} \]
\[ \def\Regret{\operatorname*{Regret}} \def\cN{\mathcal{N}} \def\eps{\varepsilon} \def\OPT{\mathrm{OPT}} \]
An introduction to the Multiplicative Weights Update method from an optimization perspective
We feel that this meta-algorithm and its analysis are simple and useful enough that they should be viewed as a basic tool taught to all algorithms students together with divide-and-conquer, dynamic programming, random sampling, and the like. — Arora, Hazan, Kale, 2012
(…) so hard to believe that it has been discovered five times and forgotten. — Papadimitriou
Although usually taught from a purely algorithmic perspective, I (Victor) think the optimization perspective is insightful.
A player and an adversary play a game through \(T\) days/rounds. On each round \(t\):
Q: Can the player do well even against an evil adversary?
On round \(t\): the player picks a distribution \(p_t \in \Delta_n\) over the \(n\) experts; the adversary, after seeing \(p_t\), picks a loss vector \(\ell_t \in [-1,1]^n\); the player then suffers the loss \(\ell_t \T p_t\).
Attempt 1 - Total Loss
Player tries to minimize \(\displaystyle \sum_{t = 1}^T \ell_t \T p_t\)
BAD: adversary can make \(\ell_t = \begin{pmatrix} 1 &1 &\dotsm& 1 \end{pmatrix} \T\) every round, so the player's total loss is \(T\) no matter what they play
Attempt 2 - Compare with the best of each round
Player tries to minimize \(\displaystyle \sum_{t = 1}^T \ell_t \T p_t - \sum_{t = 1}^T \min_{i \in [n]} \ell_t(i)\)
BAD: adversary can make the loss \(\ell_t\) be \(-1\) everywhere except at the player's most-weighted coordinate, where it is \(+1\). Then \(\min_{i \in [n]} \ell_t(i) = -1\) every round, while the player's loss is at least \(-1 + 2/n\), so the gap grows linearly in \(T\)
Attempt 3 - Compare with the best expert of the game
Player tries to minimize the regret \[ \Regret(T) = \sum_{t = 1}^T \ell_t \T p_t - \min_{i \in [n]}\sum_{t = 1}^T \ell_t(i). \] Intuition: Player’s regret of not picking the best expert in hindsight every round.
Regret may always grow, but we want \[ \frac{\Regret(T)}{T} \to 0~\text{as}~T \to \infty \] In words, average regret goes to 0.
\[ \ell_1 = \bmat{0 \\ 1 \\ -1}, \quad \ell_2 = \bmat{0 \\ -0.5 \\ 1}, \quad \ell_3 = \bmat{-1 \\ 1 \\ 1}. \]
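For instance, a quick Python sketch of the regret of a player who stays uniform on this loss sequence (the uniform strategy is just an illustrative choice):

```python
import numpy as np

# The three losses above, one row per round.
losses = np.array([[0.0,  1.0, -1.0],
                   [0.0, -0.5,  1.0],
                   [-1.0, 1.0,  1.0]])
uniform = np.full(3, 1.0 / 3)

player_loss = sum(losses[t] @ uniform for t in range(3))  # 0 + 1/6 + 1/3 = 0.5
best_expert = losses.sum(axis=0).min()                    # first expert totals -1
regret = player_loss - best_expert                        # 0.5 - (-1) = 1.5
```

Even on three rounds the uniform player regrets \(1.5\) relative to the best fixed expert in hindsight.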
The Follow the Leader algorithm picks \[ p_{t+1} \in \argmin_{p \in \Delta_n} \Big\{\sum_{s = 1}^t \ell_s \T p \Big\}. \]
This is a very intuitive algorithm, but it can fail terribly.
Key problem: player changes their decision too abruptly
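To make the failure concrete, here is a Python sketch of Follow the Leader run on a standard bad sequence (the construction, not from the slides: a first loss breaks the tie, then losses alternate so FTL always chases yesterday's leader):

```python
import numpy as np

def follow_the_leader(losses):
    """p_1 is uniform; p_{t+1} puts all mass on the expert with the smallest
    cumulative loss (a linear function on the simplex is minimized at a vertex)."""
    losses = np.asarray(losses, dtype=float)
    T, n = losses.shape
    plays = np.full((T, n), 1.0 / n)
    cum = np.zeros(n)
    for t in range(1, T):
        cum += losses[t - 1]
        p = np.zeros(n)
        p[np.argmin(cum)] = 1.0
        plays[t] = p
    return plays

# Bad sequence: after the tie-break, FTL loses 1 every round while the
# best fixed expert loses only about T/2, so regret grows linearly in T.
T = 101
losses = np.array([[0.5, 0.0]] +
                  [[0.0, 1.0] if t % 2 == 0 else [1.0, 0.0]
                   for t in range(2, T + 1)])
plays = follow_the_leader(losses)
regret = np.sum(losses * plays) - np.min(losses.sum(axis=0))  # about T/2
```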
Pick \(p_1 = \frac{1}{n}e\), a step-size \(\alpha > 0\), and \[ p_{t+1} \in \argmin_{p \in \Delta_n} \Big\{\ell_t \T p + \frac{1}{2\alpha}\norm{p - p_{t}}_2^2 \Big\}. \]
Via optimality conditions, we have
\[ -\ell_t - \frac{1}{\alpha} (p_{t+1} - p_t) \in \cN_{\Delta_n}(p_{t+1}) \iff p_t -\alpha \ell_t - p_{t+1} \in \cN_{\Delta_n}(p_{t+1}) \]
Reminder: \(z = \proj_C(x) \iff x - z \in \cN_{C}(z)\).
\[ \implies p_{t+1} = \proj_{\Delta_n}(p_t - \alpha \ell_t) \] (Gradient descent step)
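A minimal Python sketch of this projected-gradient player; the sort-based Euclidean projection onto the simplex is a standard algorithm, not from the slides:

```python
import numpy as np

def proj_simplex(x):
    """Euclidean projection of x onto the probability simplex (sort-based)."""
    n = len(x)
    u = np.sort(x)[::-1]
    css = np.cumsum(u)
    # largest k with u_k + (1 - sum_{j<=k} u_j)/k > 0
    k = np.nonzero(u + (1.0 - css) / np.arange(1, n + 1) > 0)[0][-1]
    theta = (1.0 - css[k]) / (k + 1.0)
    return np.maximum(x + theta, 0.0)

def pgd_plays(losses, alpha):
    """Online projected gradient descent: p_{t+1} = proj(p_t - alpha * l_t)."""
    losses = np.asarray(losses, dtype=float)
    T, n = losses.shape
    p = np.full(n, 1.0 / n)
    plays = [p]
    for t in range(T - 1):
        p = proj_simplex(p - alpha * losses[t])
        plays.append(p)
    return np.array(plays)
```

With the theorem's step size \(\alpha = 1/\sqrt{nT}\), the plays stay on the simplex and the realized regret stays within the stated bound.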
Theorem If \(\alpha = 1/\sqrt{nT}\), then \(\Regret(T) \leq \sqrt{nT}\).
Pick \(p_1 = \frac{1}{n}e\), step size \(\alpha > 0\), and update weights as \[ p_{t+1} \in \argmin_{p \in \Delta_n} \Big\{\sum_{s = 1}^t \ell_s \T p + \frac{1}{\alpha}\sum_{i = 1}^n p_i \ln p_i \Big\}. \]
Alternative view: Start with \(w_1 = e\). Then
Pick \(p_t = w_t/\norm{w_t}_1\)
Multiplicative update \(w_{t+1}(i) = w_t(i)\exp(-\alpha \ell_t(i))\)
Entropy vs \(\ell_2\)-norm
Theorem If \(\alpha =\sqrt{ 2 \ln(n)/T}\), then \(\Regret(T) \leq \sqrt{2T\ln n}\).
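A minimal Python sketch of the multiplicative update; the random loss sequence below is just an illustration, since the theorem's bound holds for any losses in \([-1,1]^n\):

```python
import numpy as np

def mwu_plays(losses, alpha):
    """Multiplicative weights: w_1 = e, p_t = w_t / ||w_t||_1,
    w_{t+1}(i) = w_t(i) * exp(-alpha * l_t(i))."""
    losses = np.asarray(losses, dtype=float)
    T, n = losses.shape
    w = np.ones(n)
    plays = np.empty((T, n))
    for t in range(T):
        plays[t] = w / w.sum()
        w = w * np.exp(-alpha * losses[t])
    return plays

# Check the theorem's bound on an arbitrary (here random) loss sequence.
rng = np.random.default_rng(1)
T, n = 500, 10
losses = rng.uniform(-1, 1, size=(T, n))
alpha = np.sqrt(2 * np.log(n) / T)
plays = mwu_plays(losses, alpha)
regret = np.sum(losses * plays) - np.min(losses.sum(axis=0))
```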
The game is defined by a payoff matrix \(A \in \R^{m \times n}\); for the regret bounds below we assume the entries are normalized so that \(A \in [0,1]^{m \times n}\). We have two players: the row player and the column player. In each round, the row player picks a mixed strategy \(q \in \Delta_m\) over the rows and the column player picks a mixed strategy \(p \in \Delta_n\) over the columns; the row player then pays the column player the expected payoff
\[ \sum_{i = 1}^m \sum_{j = 1}^n A(i,j) q_i p_j = q \T A p. \]
Question: Does it matter who plays first?
Von Neumann’s Minmax Theorem \[ \max_{p\in \Delta_n} \min_{q \in \Delta_m} q\T A p = \min_{q \in \Delta_m} \max_{p\in \Delta_n} q\T A p = \OPT \]
Let’s play many rounds with the column player going first.
Define \(\displaystyle \bar{p} = \frac{1}{T} \sum_{t = 1}^T p_t\) and \(\displaystyle \bar{q} = \frac{1}{T} \sum_{t = 1}^T q_t\). Then:
\[ \OPT - \frac{\Regret(T)}{T} \leq \bar{q} \T A \bar{p} \leq \OPT + \frac{\Regret(T)}{T}\]
If \(T >2 \ln(n)/\eps^2\), then \(\displaystyle \OPT - \eps \leq \bar{q} \T A \bar{p} \leq \OPT + \eps\)
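Putting it together, a Python sketch: the column player runs multiplicative weights on its gains \(A\T q_t\) while the row player best-responds each round. The 2×2 game and the tolerance below are illustrative choices, not from the slides:

```python
import numpy as np

def solve_zero_sum(A, T):
    """Approximately solve max_p min_q q^T A p, with A in [0,1]^{m x n}.

    Column player: multiplicative weights on gains A^T q_t.
    Row player: best response (the row minimizing (A p_t)_i).
    Averaged strategies give q_bar^T A p_bar within Regret(T)/T of OPT.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    alpha = np.sqrt(2 * np.log(n) / T)
    w = np.ones(n)
    p_bar = np.zeros(n)
    q_bar = np.zeros(m)
    for _ in range(T):
        p = w / w.sum()
        q = np.zeros(m)
        q[np.argmin(A @ p)] = 1.0          # row player best-responds
        p_bar += p / T
        q_bar += q / T
        w = w * np.exp(alpha * (A.T @ q))  # reward high-gain columns
    return p_bar, q_bar

# Matching-pennies-style game: the value is 0.5.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
p_bar, q_bar = solve_zero_sum(A, 4000)  # q_bar @ A @ p_bar is near 0.5
```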