Recap & context
- Last lecture we established the importance of convexity in classical optimization and saw a simple subgradient algorithm for general convex optimization
- Here we continue developing algorithms for convex optimization, but in the more general model of online learning / online optimization
- We will cover the online convex optimization framework, regularization, strong convexity, and online to batch conversion
Online Convex Optimization
- Online optimization model where loss functions $f_1,\ldots,f_T$ are convex functions over a convex decision set $W \subseteq \R^d$
- Recall: modeled as a game between learner/player and environment/adversary
<aside>
🚧 Online convex optimization (OCO)
For $t=1,2,\ldots,T$:
- player/learner chooses decision $w_t \in W$
- adversary chooses convex loss function $f_t : W \to \R$
- player incurs loss $f_t(w_t)$ and observes $f_t$ as feedback
Player’s goal is to minimize regret:
$$
R_T = \sum_{t=1}^T f_t(w_t) - \min_{w^* \in W} \sum_{t=1}^T f_t(w^*)
$$
</aside>
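The protocol above can be sketched in code. This is a minimal, hypothetical instantiation (not part of the lecture): the decision set $W$ is the Euclidean unit ball in $\R^d$, the adversary plays quadratic losses $f_t(w) = \frac{1}{2}\|w - z_t\|^2$, and the learner runs projected online gradient descent with step size $\eta_t = 1/\sqrt{t}$ (the online analogue of the subgradient method from last lecture).

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 3, 2000

def project(w):
    """Euclidean projection onto the unit ball W = {w : ||w|| <= 1}."""
    n = np.linalg.norm(w)
    return w if n <= 1.0 else w / n

w = np.zeros(d)                # w_1: initial decision
plays, targets = [], []
for t in range(1, T + 1):
    z = project(rng.normal(size=d))    # adversary's choice z_t
    plays.append(w.copy())
    targets.append(z)
    grad = w - z                       # gradient of f_t(w) = ||w - z_t||^2 / 2 at w_t
    w = project(w - grad / np.sqrt(t)) # projected OGD step, eta_t = 1/sqrt(t)

# Best fixed decision in hindsight: sum_t ||w - z_t||^2 is minimized over W
# by the projection of the mean of the z_t's onto the ball.
w_star = project(np.mean(targets, axis=0))
loss = lambda w, z: 0.5 * np.dot(w - z, w - z)
R_T = sum(loss(wt, zt) for wt, zt in zip(plays, targets)) \
    - sum(loss(w_star, zt) for zt in targets)
print(R_T, R_T / T)
```

Note how the structure mirrors the game: the learner commits to $w_t$ before seeing $z_t$, then observes the full loss as feedback and updates.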
- equivalently, we can denote the adversary’s choice by $z_t \in Z$ and write the loss functions as $f_t(w) = f(w, z_t)$ for a fixed function $f : W \times Z \to \R$ that is known in advance (e.g., in online linear regression, $z_t = (x_t, y_t)$ and $f(w, z_t) = (\langle w, x_t \rangle - y_t)^2$)
- regret is a comparative benchmark: the goal is to compete with the single best fixed decision $w^*$ in hindsight
- if $R_T = o(T)$ (equivalently, $R_T/T \to 0$ as $T \to \infty$), then the player’s average loss approaches that of the best fixed decision, so performance improves with time
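To see the average regret $R_T/T$ vanish numerically, here is a small illustrative experiment (a sketch under assumed choices, not from the lecture): online subgradient descent on one-dimensional absolute losses $f_t(w) = |w - z_t|$ over $W = [-1, 1]$, with step size $\eta_t = 1/\sqrt{t}$, evaluated at several horizons $T$.

```python
import numpy as np

def average_regret(T, seed=0):
    """Run online subgradient descent on f_t(w) = |w - z_t| and return R_T / T."""
    rng = np.random.default_rng(seed)
    z = rng.uniform(-1, 1, size=T)            # adversary's choices z_1..z_T
    w, cum_loss = 0.0, 0.0
    for t in range(1, T + 1):
        zt = z[t - 1]
        cum_loss += abs(w - zt)               # player's loss f_t(w_t)
        g = np.sign(w - zt)                   # a subgradient of f_t at w_t
        w = np.clip(w - g / np.sqrt(t), -1, 1)  # projected step onto [-1, 1]
    # Best fixed decision in hindsight: a median of z_1..z_T minimizes
    # the cumulative absolute loss.
    best = np.sum(np.abs(np.median(z) - z))
    return (cum_loss - best) / T

rates = [average_regret(T) for T in (100, 1000, 10000)]
print(rates)   # average regret shrinks as the horizon T grows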