Looped Transformers as Programmable Computers: Nonlinear Function
Created: February 12, 2023
Modified: February 13, 2023

Looped Transformers as Programmable Computers

This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

Some ideas from this paper.

Binary positional encodings. Memory index $i$ is represented by a binary vector $p_i \in \{-1, 1\}^{\log(n)}$. This has the property that $p_i^T p_i = \log n$ for any $i$, and $p_i^T p_j \le \log n - 1$ for any $i \ne j$.
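A quick sanity check of the inner-product property, as a minimal sketch. The function names, the 0-based indexing, the least-significant-bit-first ordering, and taking the log base 2 (rounded up) are my assumptions, not something fixed by the paper. Two distinct $\pm 1$ vectors that differ in $k$ coordinates have inner product $\log n - 2k$, so the cross products are in fact at most $\log n - 2$, which is consistent with the bound above.

```python
def binary_positional_encoding(i: int, n: int) -> list[int]:
    """Encode index i in [0, n) as a vector in {-1, +1}^d.

    Assumption: d = ceil(log2(n)), bits listed least significant first,
    with bit 1 mapped to +1 and bit 0 mapped to -1.
    """
    d = max(1, (n - 1).bit_length())
    return [1 if (i >> b) & 1 else -1 for b in range(d)]


def dot(u: list[int], v: list[int]) -> int:
    """Plain inner product of two equal-length integer vectors."""
    return sum(a * b for a, b in zip(u, v))


n = 8
d = max(1, (n - 1).bit_length())  # log2(8) = 3
codes = [binary_positional_encoding(i, n) for i in range(n)]

# p_i^T p_i for every i: each entry squares to 1, so the sum is d.
self_products = [dot(p, p) for p in codes]

# Largest p_i^T p_j over all pairs with i != j.
max_cross = max(
    dot(codes[i], codes[j])
    for i in range(n)
    for j in range(n)
    if i != j
)

print(self_products, max_cross)
```

The gap between the self product and the largest cross product is what separates a matching index from every other index.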