superposition: Nonlinear Function
Created: September 14, 2022
Modified: September 14, 2022

superposition

This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

A dd-dimensional vector can represent dd distinct orthogonal features, but due to the weirdness of high-dimensional geometry, it can represent O(exp(d))O(\exp(d)) almost-orthogonal features, which we define as feature vectors having a dot product of at most ϵ\epsilon. This allows for networks to represent a 'superposition' of exponentially many potential features in a single vector.

https://transformer-circuits.pub/2022/toy_model/index.html

https://en.wikipedia.org/wiki/Johnson%E2%80%93Lindenstrauss_lemma

An argument I like is that random vectors in high-dimensional space are orthogonal with high probability. (TODO: this argument is currently broken, need to remember/look up what's missing).

  1. If we treat each dimension as standard normal with variance 1/d1/d, then in high dimensions we'll get unit vectors with high probability.
  2. The dot product x,y=i=1dxiyi\langle x, y\rangle = \sum_{i=1}^d x_i y_i of two independent normal vectors is a sum of dd terms, each of which has mean zero and variance 1/d21/d^2, so the sum itself has mean zero and variance 1/d1/d . The probability of the sum being far from zero can be bounded by various concentration inequalities; naively we appeal to the central limit theorem to show that the sum is approximately normally distributed and thus
    P(x,y>ϵ)exp(dϵ22)\mathbb{P}\left(|\langle x, y\rangle| > \epsilon\right) \approx \exp\left(-\frac{d\epsilon^2}{2}\right)
  3. We want to make this event so unlikely that, given an exponential number of vectors, and considering all (e2d)(e^{2d}) pairs, a union bound still gives a low probability that any of the dot products has magnitude ϵ\ge \epsilon. Summing over all e2de^{2d} events gives
    exp(2d)exp(dϵ22)=exp(d2(4ϵ2))\exp(2d) \cdot \exp\left(-\frac{d\epsilon^2}{2}\right) = \exp\left(\frac{d}{2}\left(4 - \epsilon^2\right)\right)
    but this is not a useful bound since we are likely to want to take ϵ\epsilon much smaller than 2. So something in this argument needs to be strengthened.