This page is a general jumping-off point for organizing my thoughts about the [ AI research landscape ], where the field is, where it is…
When thinking about the [ reward ] function for a real-world AI system, there is always some causal process that determines reward. For…
This note is a scratchpad for investigating the expressivity of the [ transformer ] architecture. In general, one set of intuitions that we…
This note lists some ideas and directions for research I'm interested in or excited about. Some are more fleshed out than others, some more…
[ grokking ] / [ phase change hypothesis ]; emergence of near-discrete features in large transformers; symmetries / non-[ identifiable…
(see also: [ large models ]) There's a viewpoint that neural nets just memorize the training data, so the more training data you have, the…