language model
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.
Links to this note
reversal curse
References: The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A" https://arxiv.org/abs/2309.12288 Studying Large Language…
intelligence forklift
Boaz Barak writes in GPT as an "Intelligence Forklift." that [ language model ]s seem to function effectively as [ tool AI ] that can…
predictive agent
Consider an agent that is purely concerned with [ predictive processing ]: finding the optimal [ compression ], or equivalently the optimal…
embedded agent
Notes on Abram Demski and Scott Garrabrant's sequence on Embedded Agency. Embedded Agents: Classic models of rational [ agency ], such as…
simulator AI
References: https://generative.ink/posts/simulators/ It seems pretty clear that the intelligence emerging from [ language model ]s is not…
training for consistency
These days we think a lot about using data to train large [ language model ]s. But there's only so much data in the world; eventually we'll…
research idea
This note lists some ideas and directions for research I'm interested in or excited about. Some are more fleshed out than others, some more…
transformer
The core of the transformer architecture is multi-headed [ attention ]. The transformer block consists of a multi-headed attention layer…
explicit models of uncertainty
(note: this is dancing around the issues around why I think [ probabilistic programming is not AI research ], even if it will be a…
classification is special
The [ distinction ] between classification and regression is, from one point of view, arbitrary: it's all just function approximation, and…