Bregman divergence: Nonlinear Function
Created: September 07, 2020
Modified: March 07, 2022

Bregman divergence

This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

For any strictly convex, differentiable function $\Phi$, define the Bregman divergence:

$$D_\Phi(x, y) = \Phi(x) - \Phi(y) - \langle x - y, \nabla \Phi(y) \rangle$$
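The definition translates directly to code. Here is a minimal sketch (the names `bregman`, `phi`, and `grad_phi` are mine, not from any particular library), using NumPy and checking it against the squared-Euclidean example:

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <x - y, grad phi(y)>."""
    return phi(x) - phi(y) - np.dot(x - y, grad_phi(y))

# With phi the squared norm, the divergence reduces to
# the squared Euclidean distance ||x - y||^2.
phi = lambda v: np.dot(v, v)
grad_phi = lambda v: 2.0 * v

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
d = bregman(phi, grad_phi, x, y)  # equals ||x - y||^2
```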

Examples:

(Squared) Euclidean distance: choose the squared norm, $\Phi(x) = \|x\|^2$; then $D_\Phi(x, y) = \|x - y\|^2$.

Kullback-Leibler (KL) divergence: choose the negative entropy, $\Phi(p) = \sum_i p_i \log p_i$ (for $p$ on the probability simplex); then $D_\Phi(p, q) = \sum_i p_i \log(p_i / q_i)$.
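The KL example can be checked numerically in the same spirit. A self-contained sketch (the two distributions are arbitrary; the $+1$ terms in the gradient cancel because both vectors sum to one):

```python
import numpy as np

p = np.array([0.2, 0.8])
q = np.array([0.5, 0.5])

neg_entropy = lambda v: np.sum(v * np.log(v))
grad = lambda v: np.log(v) + 1.0  # gradient of the negative entropy

# Bregman divergence with phi = negative entropy...
d = neg_entropy(p) - neg_entropy(q) - np.dot(p - q, grad(q))

# ...agrees with the KL divergence KL(p || q).
kl = np.sum(p * np.log(p / q))
```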