the best things have many stories: Nonlinear Function
Created: October 02, 2020
Modified: January 23, 2022


This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.
  • I used to think that there was a 'best' way to motivate an area. For example, in VI, the ELBO is derived from the KL divergence between the variational approximation and the true Bayesian posterior. There are other motivations---from convex analysis, or importance weighting, or minimum description length---but I thought these were fundamentally a bit confused or beside the point (maybe MDL less so than the others).
  • But in fact: the existence of multiple independent stories is itself what makes the ELBO powerful. If several distinct models or derivations all agree on the same object, that object is likely to be important---the sort of thing that carves the world at the joints.
  • What other things have many stories?
    • Gaussian processes: infinite-dimensional Gaussians, kernelized linear regression, infinite-width neural networks.
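As a concrete anchor for the KL story above, here is a minimal numerical sketch (my own illustrative setup, not from the note) of the identity behind it: for any variational distribution q(z), log p(x) = ELBO(q) + KL(q(z) ‖ p(z|x)), so the ELBO lower-bounds the log evidence, with the gap equal to the KL divergence to the true posterior.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 4                                    # number of discrete latent states z
prior = rng.dirichlet(np.ones(K))        # p(z)
lik = rng.uniform(0.1, 1.0, size=K)      # p(x | z) for one observed x

evidence = float(prior @ lik)            # p(x) = sum_z p(z) p(x | z)
posterior = prior * lik / evidence       # exact Bayesian posterior p(z | x)

q = rng.dirichlet(np.ones(K))            # an arbitrary variational q(z)

# ELBO(q) = E_q[log p(x, z) - log q(z)]
elbo = float(np.sum(q * (np.log(prior * lik) - np.log(q))))
# KL(q || p(z|x))
kl = float(np.sum(q * (np.log(q) - np.log(posterior))))

# The identity: log evidence decomposes as ELBO plus the KL gap.
assert np.isclose(elbo + kl, np.log(evidence))
assert elbo <= np.log(evidence)
```

Maximizing the ELBO over q is therefore the same move as minimizing the KL to the true posterior, which is the derivation the first bullet refers to; the other stories (convex duality, importance weighting, MDL) arrive at the same quantity by different routes.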