minimax duality
Created: July 06, 2022
Modified: July 07, 2022


This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

Consider a bilevel optimization problem (or saddle-point problem) on a two-argument function f(x, y). In general it holds that

\max_{x\in\mathcal{X}} \min_{y\in\mathcal{Y}} f(x, y) \le \min_{y\in\mathcal{Y}} \max_{x\in\mathcal{X}} f(x, y).

That is, the inner optimization is at least as 'powerful' as the outer optimization. The difference between the two sides of the inequality is the duality gap.

We might view this as a two-player zero-sum game: the outer optimizer has to move first (choosing a row of the payoff matrix) and the inner optimizer gets to see that move and choose the best response. Going second gives more information with which to make your move, so the second mover always does at least as well (and sometimes strictly better).
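
To make this concrete, here's a small numerical sketch (the 3x3 payoff matrix F is hypothetical, chosen arbitrarily for illustration):

```python
import numpy as np

# Hypothetical payoff matrix: F[i, j] is the payoff f(x_i, y_j) to the
# maximizing (row) player; values are arbitrary.
F = np.array([[3., 1., 4.],
              [1., 5., 9.],
              [2., 6., 5.]])

# max_x min_y f(x, y): the row player moves first, so each row is scored
# by its worst-case (smallest) entry, as chosen by the column player.
max_min = F.min(axis=1).max()

# min_y max_x f(x, y): the column player moves first, so each column is
# scored by its best-case (largest) entry, as chosen by the row player.
min_max = F.max(axis=0).min()

print(max_min, min_max)  # 2.0 3.0: max-min <= min-max, as claimed.
```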

For example, suppose that f(x, y) = \|x - y\| measures the distance between x and y. Then the LHS of the above allows the minimizer to go second, choosing y = x to obtain the objective

\max_{x\in\mathcal{X}} \min_{y\in\mathcal{Y}} f(x, y) = \max_{x\in\mathcal{X}} f(x, x) = 0

while the RHS allows the maximizer to go second, choosing an arbitrarily distant value x \to y + \infty so that

\min_{y\in\mathcal{Y}} \max_{x\in\mathcal{X}} f(x, y) = \min_{y\in\mathcal{Y}} f(y + \infty, y) = \infty.
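
We can check a bounded version of this gap numerically; a minimal sketch, assuming for concreteness that \mathcal{X} = \mathcal{Y} = [0, 1] discretized to a grid (so the gap is finite rather than infinite):

```python
import numpy as np

# Discretize X = Y = [0, 1]; this bounded choice is an assumption made
# here so that the duality gap is finite.
grid = np.linspace(0.0, 1.0, 101)
# F[i, j] = |x_i - y_j|, the distance objective from the example above.
F = np.abs(grid[:, None] - grid[None, :])

max_min = F.min(axis=1).max()  # minimizer goes second and matches y = x
min_max = F.max(axis=0).min()  # maximizer goes second and runs to an endpoint

print(max_min)  # 0.0: the minimizer can always match x exactly.
print(min_max)  # 0.5: the best first move for the minimizer is y = 0.5.
```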

Saddle-point property

We say that \tilde{x}, \tilde{y} form a saddle point of f (unlike the usual calculus definition, this is a global property that does not require f to be differentiable) if

f(x, \tilde{y}) \le f(\tilde{x}, \tilde{y}) \le f(\tilde{x}, y) \;\;\forall x, y.

This is equivalent to the condition that \tilde{x} maximizes f(\cdot, \tilde{y}) and \tilde{y} minimizes f(\tilde{x}, \cdot),

f(\tilde{x}, \tilde{y}) = \max_x f(x, \tilde{y}) = \min_y f(\tilde{x}, y).
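
As an illustration (a hypothetical example, not from the notes above), f(x, y) = y^2 - x^2 is concave in x and convex in y, with a saddle point at (0, 0); we can check the definition on a grid:

```python
import numpy as np

# Hypothetical concave-convex example: f(x, y) = y**2 - x**2.
def f(x, y):
    return y**2 - x**2

xs = np.linspace(-1.0, 1.0, 201)
ys = np.linspace(-1.0, 1.0, 201)
x_tilde, y_tilde = 0.0, 0.0  # candidate saddle point

# Check f(x, y~) <= f(x~, y~) <= f(x~, y) over the grid.
assert np.all(f(xs, y_tilde) <= f(x_tilde, y_tilde))  # x~ maximizes f(., y~)
assert np.all(f(x_tilde, y_tilde) <= f(x_tilde, ys))  # y~ minimizes f(x~, .)
```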

If a saddle point exists, it follows that the minimax inequality is tight (equivalently, the duality gap is zero - sometimes called strong duality):

\max_x \min_y f(x, y) = \min_y \max_x f(x, y)

with both sides obtaining the value f(\tilde{x}, \tilde{y}).

Why? From the definition of the saddle point, it follows that

f(\tilde{x}, \tilde{y}) \ge \min_y \max_x f(x, y)

since the outer minimizer can always choose y = \tilde{y}, in which case the inner maximizer's best response gives \max_x f(x, \tilde{y}) = f(\tilde{x}, \tilde{y}) by the saddle-point property, so the minimizer will always do at least that well. A symmetric argument shows that

f(\tilde{x}, \tilde{y}) \le \max_x \min_y f(x, y).

Putting these facts together, we derive the reverse of the minimax inequality,

\min_y \max_x f(x, y) \le \max_x \min_y f(x, y)

which, combined with the general minimax inequality above, gives equality.
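
Continuing the hypothetical f(x, y) = y^2 - x^2 example from above, a quick grid check confirms that both orderings agree at the saddle value f(0, 0) = 0:

```python
import numpy as np

xs = np.linspace(-1.0, 1.0, 201)
ys = np.linspace(-1.0, 1.0, 201)
# F[i, j] = f(x_i, y_j) for the saddle example f(x, y) = y**2 - x**2.
F = ys[None, :]**2 - xs[:, None]**2

max_min = F.min(axis=1).max()
min_max = F.max(axis=0).min()
print(max_min, min_max)  # both ~0.0 = f(0, 0): the duality gap is zero.
```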

The converse is also true: strong duality implies that the common value is obtained at a saddle point x^*, y^* (assuming the optima are attained), where x^* = \arg\max_x \min_y f(x, y) is primal optimal and y^* = \arg\min_y \max_x f(x, y) is dual optimal.