Created: January 16, 2022
Modified: January 16, 2022
meta-level shape of machine learning
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

Unlike most modern deep learning systems, humans:
- don't have separate training/test phases (though we may have wake/sleep).
- don't operate on i.i.d. inputs --- indeed, we get a lot of juice from learning the dependence structure in our inputs.
- may not make a hard distinction between train-time 'slow weights' and inference-time 'fast weights' (though we do have short- and long-term memory).
- don't optimize a specific objective; indeed, we can learn objectives.
- likely don't follow a single global gradient.
- can rewrite almost all levels of our computational processes as part of the learning process.
It might be useful to reflect on these differences; the first two are sketched in code below.
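As a toy illustration (my own, not a reference implementation --- the model, data, and learning rate are all made up for the example), here is the standard shuffle-then-train-then-test recipe next to a single-pass streaming setup where every input is first a test point and then a training point:

```python
import random

# Toy regression data with temporal structure: y = 2x + 1, visited in order.
data = [(x, 2.0 * x + 1.0) for x in range(100)]

def sgd_step(w, b, x, y, lr=1e-4):
    # One gradient step on squared error for a 1-D linear model.
    err = (w * x + b) - y
    return w - lr * err * x, b - lr * err

# Standard deep-learning shape: shuffle to approximate i.i.d. sampling,
# run a distinct training phase, then evaluate with weights frozen.
w, b = 0.0, 0.0
for epoch in range(20):
    shuffled = random.sample(data, len(data))  # deliberately destroys the input ordering
    for x, y in shuffled:
        w, b = sgd_step(w, b, x, y)
test_loss = sum((w * x + b - y) ** 2 for x, y in data) / len(data)

# Streaming shape: one pass over one long, ordered trajectory, with no
# boundary between training and testing.
w2, b2 = 0.0, 0.0
online_loss = 0.0
for x, y in data:                              # temporal order preserved
    online_loss += (w2 * x + b2 - y) ** 2      # evaluate before updating
    w2, b2 = sgd_step(w2, b2, x, y)            # then learn from the same example

print(f"test loss after i.i.d. training: {test_loss:.3f}")
print(f"cumulative online loss, single pass: {online_loss:.1f}")
```

Note that the streaming loop never gets to revisit an example, and its loss is charged on predictions made *before* learning from each input; that's much closer to the situation a human is in.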
Specific to reinforcement learning: we don't really have a notion of repeated terminating trajectories. A human gets only one trajectory, ever, and it's very, very long. So the model-free RL setting of many short, resettable episodes doesn't map cleanly onto the human case.
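A sketch of that structural difference, in the same toy spirit --- the gym-style reset()/step() interface and the ToyEnv/ToyAgent names are stand-ins of my own, not any real library:

```python
import random

class ToyEnv:
    """Trivial environment: reward 1 for guessing a hidden bit, re-drawn on reset."""
    def __init__(self):
        self.hidden, self.t = random.randint(0, 1), 0
    def reset(self):
        self.hidden, self.t = random.randint(0, 1), 0
        return self.hidden
    def step(self, action):
        self.t += 1
        reward = 1.0 if action == self.hidden else 0.0
        done = self.t >= 10              # episodes terminate after 10 steps
        return self.hidden, reward, done

class ToyAgent:
    def act(self, obs):
        return obs                       # degenerate policy: echo the observation
    def update(self, reward):
        pass                             # learning stubbed out; the loop shape is the point

env, agent = ToyEnv(), ToyAgent()

# Episodic model-free RL: many short terminating trajectories, each reset
# wiping the state so returns can be averaged over repeated trials.
for episode in range(100):
    obs, done = env.reset(), False
    while not done:
        obs, reward, done = env.step(agent.act(obs))
        agent.update(reward)

# The human setting: a single trajectory that never terminates or resets.
obs = env.reset()                        # happens exactly once, at "birth"
for t in range(1_000):                   # stand-in for one very, very long life
    obs, reward, _ = env.step(agent.act(obs))   # `done` is ignored: no second life
    agent.update(reward)
```

In the second loop there is no outer episode counter to average over; whatever learning signal exists has to be extracted from within the one stream.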