convex: Nonlinear Function
Created: June 25, 2022
Modified: July 09, 2022

convex

This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

A convex function ff satisfies the property that a line between any two points on its graph is on or above the graph:

tf(x)+(1t)f(y)f(tx+(1t)y)t f(x) + (1-t) f(y) \ge f(tx + (1-t)y)

for any t[0,1]t \in [0, 1]. It is strictly convex if this inequality is strict. Wikipedia has this nice illustration of the definition:

Illustration of a line connecting two points of a convex function.

A slight generalization of this definition, E[f(x)]f(Ex)E[f(x)] \ge f(Ex), is sometimes called Jensen's inequality.

The second derivative of a differentiable convex function must be nonnegative; that is, the derivative must be non-decreasing. In the multivariate setting, the Hessian of a differentiable convex function must be positive semidefinite.

A convex function is always greater than or equal to its first-order Taylor expansion. A function is λ\lambda-strongly convex if it exceeds its linear approximation by a quadratic quantity: f(y)f(x)+(yx)f(x)+λ2yx22f(y) \ge f(x) + (y-x)\nabla f(x) + \frac{\lambda}{2} \|y-x\|_2^2. An equivalent condition is that fλ2x22f - \frac{\lambda}{2}\|x\|_2^2 is convex.