Notes on the Alignment Forum's Value Learning sequence curated by Rohin Shah. Ambitious value learning: the idea of learning 'the human…
The idea is that a mesa-optimizing policy with access to sufficient information about the world (e.g., web search) might…
We often see optimization problems with objectives of the form where is the main function of interest (e.g., training loss in machine…
In modern ML, representation learning is the art of trying to find useful abstractions, embodied as encoding networks. We can learn…