Modified: August 04, 2022
Wasserstein
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.The -Wasserstein distance between probability distributions is defined as
where the infimum is over all joint distributions having marginal distributions and over the first and second arguments respectively.
Any such joint distribution can be viewed as a 'transport plan'. The conditional distributions tell us how any probability mass at a sample should be 'transported' in order to produce a sample , and the expected distance between such pairs is the cost of the plan. So with , the Wasserstein distance represents the cost of the lowest-cost transport plan (the optimal transport) from to or vice versa.
For univariate probability distributions in Euclidean space, the Wasserstein metric is equivalently the metric on their inverse cumulative distribution functions,