Modified: June 25, 2022
convex dual
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.
See also:
- https://www2.sonycsl.co.jp/person/nielsen/Note-LegendreTransformation.pdf
- Jess Riedel on the Legendre transform in physics looks very useful, though I haven't read it yet.
For any function $f: \mathbb{R}^n \to \mathbb{R}$ we define its convex conjugate or Legendre transform or Fenchel dual function to be
$$f^*(y) = \sup_x \left( \langle y, x \rangle - f(x) \right).$$
This is always convex because it is the supremum of linear functions of $y$.
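As a sanity check (a standard worked example): the quadratic $f(x) = \frac{1}{2}x^2$ is its own conjugate. Setting the derivative of $yx - \frac{1}{2}x^2$ to zero gives $x = y$, so
$$f^*(y) = \sup_x \left( yx - \tfrac{1}{2}x^2 \right) = y \cdot y - \tfrac{1}{2}y^2 = \tfrac{1}{2}y^2.$$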
We can view this as the analogue of the Fourier transform in the $(\max, +)$ semiring instead of the $(+, \times)$ semiring. That is, instead of $\hat{f}(\omega) = \int e^{-i\omega x} f(x)\, dx$, where we:
- add a bunch of elements, each consisting of the product of $f(x)$ with $e^{-i\omega x}$, we
- take the max of a bunch of elements, each consisting of the sum of $-f(x)$ with $\langle y, x \rangle$
- where $\langle y, x \rangle$ is multiplication, viewable as repeated addition; analogously, $e^{-i\omega x}$ is viewable as repeated multiplication. (I don't quite understand if/how the imaginary numbers play into this analogy.)
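Written side by side, the correspondence is:
$$\hat{f}(\omega) = \int e^{-i\omega x} \cdot f(x)\, dx \qquad \longleftrightarrow \qquad f^*(y) = \sup_x \left( \langle y, x \rangle + (-f(x)) \right),$$
with the integral (a big sum) replaced by a supremum (a big max), and the pointwise product replaced by a pointwise sum.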
A shared intuition is that we are 'projecting' a function onto a set of components; the transform tells us how much of each component there is. In the Fourier transform, the components are sinusoids, and the transform tells us how much content there is at each frequency $\omega$. In the convex conjugate, the components are slopes; the transform tells us how much of the original function is at each slope $y$.
For example, the line $f(x) = ax$ has conjugate $f^*(y) = \sup_x (y - a)x$. At $y = a$ this is zero; for all other $y$ it is infinity.
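Here is a minimal numerical sketch of this 'slope content' picture (my own, using NumPy on a finite grid, so it only approximates the true supremum; the helper name `conjugate` is made up):
```python
import numpy as np

# Grids of primal points x and dual points (slopes) y.
x = np.linspace(-5, 5, 1001)
y = np.linspace(-2, 2, 401)

def conjugate(f_vals, x, y):
    """Discrete convex conjugate: f*(y) = max over the x-grid of y*x - f(x)."""
    # np.outer(y, x) has shape (len(y), len(x)); subtracting f_vals
    # broadcasts f(x) across each row before taking the row-wise max.
    return np.max(np.outer(y, x) - f_vals, axis=1)

# f(x) = x^2/2 should be (approximately) self-dual: f*(y) = y^2/2.
f_star = conjugate(0.5 * x**2, x, y)
print(np.max(np.abs(f_star - 0.5 * y**2)))  # small grid-resolution error
```
The maximizer $x = y$ stays inside the $x$-grid for every $y$ in the $y$-grid, which is what keeps the truncation error small.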
As Wikipedia says, "This definition can be interpreted as an encoding of the convex hull of the function's epigraph in terms of its supporting hyperplanes". TODO: is there a convex conjugate equivalent of the fast Fourier transform?
Note that $f^*$ is defined over the space of linear functionals $y$ operating on vectors $x$. For a given linear functional $y$, it tells us the maximum amount by which that functional exceeds $f$ (equivalently, the amount by which it must be shifted down in order to never exceed $f$, i.e., to be a supporting hyperplane). If $f$ grows superlinearly in every direction, then no fixed linear functional can exceed $f$ by an arbitrarily large amount: eventually, $f$ will catch up, so $f^*$ is finite everywhere. (Strict convexity alone is not enough for this: $\sqrt{1 + x^2}$ is strictly convex but asymptotically linear, and any $y$ with $|y| > 1$ exceeds it by an arbitrarily large amount. Strong convexity is enough.)
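For instance, for $f(x) = |x|$ we get $f^*(y) = \sup_x (yx - |x|)$, which is $0$ for $|y| \le 1$ and $+\infty$ otherwise: lines with slope in $[-1, 1]$ can be shifted to support the graph of $|x|$, while steeper lines overtake it without bound.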
If $f$ is itself convex (and differentiable), then setting the gradient of the inner term $\langle y, x \rangle - f(x)$ to zero:
$$y - \nabla f(x) = 0,$$
we can solve exactly for the conjugate evaluated at the gradient of some point $x$:
$$f^*(\nabla f(x)) = \langle \nabla f(x), x \rangle - f(x).$$
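Plugging in a standard example: for $f(x) = e^x$, stationarity gives $y = e^x$, i.e., $x = \ln y$ for $y > 0$, so
$$f^*(y) = y \ln y - y \quad (y > 0),$$
with $f^*(0) = 0$ and $f^*(y) = +\infty$ for $y < 0$.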
Convex duality establishes a relationship between Lipschitz-continuous gradients and strong convexity.
- Recall that a function $g$ is $L$-Lipschitz iff $\|g(x) - g(y)\| \le L \|x - y\|$ for all $x, y$. For an everywhere-differentiable function this is true if and only if the gradients have norm at most $L$.
- Let $f$ be convex with an $L$-Lipschitz gradient $\nabla f$. What can we say about its convex dual $f^*$?
- It's apparently true that the convex dual will be $\frac{1}{L}$-strongly convex, and conversely, the convex dual of a $\mu$-strongly convex function must have a $\frac{1}{\mu}$-Lipschitz gradient.
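A quick check on quadratics (my example): $f(x) = \frac{L}{2}x^2$ has gradient $f'(x) = Lx$, which is $L$-Lipschitz. The inner term $yx - \frac{L}{2}x^2$ is maximized at $x = y/L$, giving
$$f^*(y) = \frac{y}{L} \cdot y - \frac{L}{2}\left(\frac{y}{L}\right)^2 = \frac{1}{2L} y^2,$$
whose second derivative is $\frac{1}{L}$, i.e., $f^*$ is exactly $\frac{1}{L}$-strongly convex.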