minimax duality
Created: July 06, 2022
Modified: July 07, 2022


This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

Consider a bilevel optimization problem (or saddle-point problem) on a two-argument function f(x, y). In general it holds that

\max_{x\in\mathcal{X}} \min_{y\in\mathcal{Y}} f(x, y) \le \min_{y\in\mathcal{Y}} \max_{x\in\mathcal{X}} f(x, y).

That is, the inner optimization is at least as 'powerful' as the outer optimization. The difference between the two sides of the inequality is the duality gap.

We might view this as a two-player zero-sum game: the outer optimizer has to move first (choosing a row of the payoff matrix) and the inner optimizer gets to see that move and choose the best response. Going second gives more information with which to make your move, so the second mover always does at least as well (and sometimes strictly better).
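
To make this concrete, here's a small numerical sketch (the 3x3 payoff matrix F is hypothetical, chosen arbitrarily for illustration):

```python
import numpy as np

# Hypothetical payoff matrix: F[i, j] is the payoff f(x_i, y_j) to the
# maximizing (row) player; values are arbitrary.
F = np.array([[3., 1., 4.],
              [1., 5., 9.],
              [2., 6., 5.]])

# max_x min_y f(x, y): the row player moves first, so each row is scored
# by its worst-case (smallest) entry, as chosen by the column player.
max_min = F.min(axis=1).max()

# min_y max_x f(x, y): the column player moves first, so each column is
# scored by its best-case (largest) entry, as chosen by the row player.
min_max = F.max(axis=0).min()

print(max_min, min_max)  # 2.0 3.0: max-min <= min-max, as claimed.
```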

For example, suppose that f(x, y) = \|x - y\| measures the distance between x and y. Then the LHS of the above allows the minimizer to go second, choosing y = x to obtain the objective

\max_{x\in\mathcal{X}} \min_{y\in\mathcal{Y}} f(x, y) = \max_{x\in\mathcal{X}} f(x, x) = 0

while the RHS allows the maximizer to go second, choosing an arbitrarily distant value x \to y + \infty so that

\min_{y\in\mathcal{Y}} \max_{x\in\mathcal{X}} f(x, y) = \min_{y\in\mathcal{Y}} f(y + \infty, y) = \infty.
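
We can check a bounded version of this gap numerically; a minimal sketch, assuming for concreteness that \mathcal{X} = \mathcal{Y} = [0, 1] discretized to a grid (so the gap is finite rather than infinite):

```python
import numpy as np

# Discretize X = Y = [0, 1]; this bounded choice is an assumption made
# here so that the duality gap is finite.
grid = np.linspace(0.0, 1.0, 101)
# F[i, j] = |x_i - y_j|, the distance objective from the example above.
F = np.abs(grid[:, None] - grid[None, :])

max_min = F.min(axis=1).max()  # minimizer goes second and matches y = x
min_max = F.max(axis=0).min()  # maximizer goes second and runs to an endpoint

print(max_min)  # 0.0: the minimizer can always match x exactly.
print(min_max)  # 0.5: the best first move for the minimizer is y = 0.5.
```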

Saddle-point property

We say that \tilde{x}, \tilde{y} form a saddle point of f (unlike the usual calculus definition, this is a global property that does not require f to be differentiable) if

f(x, \tilde{y}) \le f(\tilde{x}, \tilde{y}) \le f(\tilde{x}, y) \;\;\forall x, y.

This is equivalent to the condition that \tilde{x} maximizes f(\cdot, \tilde{y}) and \tilde{y} minimizes f(\tilde{x}, \cdot),

f(\tilde{x}, \tilde{y}) = \max_x f(x, \tilde{y}) = \min_y f(\tilde{x}, y).
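
As an illustration (a hypothetical example, not from the notes above), f(x, y) = y^2 - x^2 is concave in x and convex in y, with a saddle point at (0, 0); we can check the definition on a grid:

```python
import numpy as np

# Hypothetical concave-convex example: f(x, y) = y**2 - x**2.
def f(x, y):
    return y**2 - x**2

xs = np.linspace(-1.0, 1.0, 201)
ys = np.linspace(-1.0, 1.0, 201)
x_tilde, y_tilde = 0.0, 0.0  # candidate saddle point

# Check f(x, y~) <= f(x~, y~) <= f(x~, y) over the grid.
assert np.all(f(xs, y_tilde) <= f(x_tilde, y_tilde))  # x~ maximizes f(., y~)
assert np.all(f(x_tilde, y_tilde) <= f(x_tilde, ys))  # y~ minimizes f(x~, .)
```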

If a saddle point exists, it follows that the minimax inequality is tight (equivalently, the duality gap is zero - sometimes called strong duality):

\max_x \min_y f(x, y) = \min_y \max_x f(x, y)

with both sides obtaining the value f(\tilde{x}, \tilde{y}).

Why? From the definition of the saddle point, it follows that

f(\tilde{x}, \tilde{y}) \ge \min_y \max_x f(x, y)

since the outer minimizer can always choose y = \tilde{y}, in which case the inner maximizer's best response gives \max_x f(x, \tilde{y}) = f(\tilde{x}, \tilde{y}) by the saddle-point property, so the minimizer will always do at least that well. A symmetric argument shows that

f(\tilde{x}, \tilde{y}) \le \max_x \min_y f(x, y).

Putting these facts together, we derive the reverse of the minimax inequality,

\min_y \max_x f(x, y) \le \max_x \min_y f(x, y)

which, combined with the general minimax inequality above, gives equality.
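
Continuing the hypothetical f(x, y) = y^2 - x^2 example from above, a quick grid check confirms that both orderings agree at the saddle value f(0, 0) = 0:

```python
import numpy as np

xs = np.linspace(-1.0, 1.0, 201)
ys = np.linspace(-1.0, 1.0, 201)
# F[i, j] = f(x_i, y_j) for the saddle example f(x, y) = y**2 - x**2.
F = ys[None, :]**2 - xs[:, None]**2

max_min = F.min(axis=1).max()
min_max = F.max(axis=0).min()
print(max_min, min_max)  # both ~0.0 = f(0, 0): the duality gap is zero.
```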

The converse is also true: strong duality implies that the common value is obtained at a saddle point x^*, y^* (assuming the optima are attained), where x^* = \arg\max_x \min_y f(x, y) is primal optimal and y^* = \arg\min_y \max_x f(x, y) is dual optimal.