intrinsic motivation
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

objectives:
- maximize entropy of the state visitation distribution
- requires empirical estimates of entropy, e.g. a nonparametric k-nearest-neighbor particle estimate (see the sketch after this list)
- maximize mutual information between a random goal (or skill) vector z and the states the agent visits
- MI = H[s] - H[s | z], so you can view this as the previous maxent objective plus an extra term to minimize the conditional entropy H[s | z]: we want to explore lots of states overall, but each individual goal should lead to a narrow state distribution (see the second sketch after this list)
- papers: variational intrinsic control (2016), ..., active pretraining with successor features (2021)
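A minimal sketch of the first objective in practice: the particle-based (k-nearest-neighbor) entropy estimate that APT-style methods turn into an intrinsic reward. Each state in a batch is rewarded by the log distance to its k-th nearest neighbor, so sparsely visited regions score higher. The function name, batch shapes, and constants here are illustrative, not taken from any particular paper's code.

```python
import numpy as np

def knn_entropy_reward(states, k=3):
    """Particle-based entropy intrinsic reward.

    Rewards each state by the log distance to its k-th nearest neighbor
    within the batch: states in sparsely visited regions score higher,
    so maximizing the summed reward pushes the visitation distribution
    toward higher entropy.
    """
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)            # exclude self-distances
    kth = np.sort(dists, axis=1)[:, k - 1]     # distance to k-th nearest neighbor
    return np.log(kth + 1e-6)                  # one intrinsic reward per state

# usage: reward a batch of (encoded) states collected by the policy
batch = np.random.randn(256, 8)                # stand-in for encoded states
intrinsic_rewards = knn_entropy_reward(batch)
```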
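And a sketch of how the MI objective usually gets turned into a reward (the DIAYN-style variational lower bound): a learned discriminator q(z | s) tries to infer the skill from the state, and the per-step reward log q(z | s) - log p(z) is high when the skill is identifiable from where the agent ended up. The discriminator logits below are a placeholder for whatever classifier you'd actually train, and the uniform prior is an assumption.

```python
import numpy as np

def skill_discovery_reward(discriminator_logits, z, n_skills):
    """Variational lower-bound reward for I(s; z).

    discriminator_logits: unnormalized scores over skills for the current
    state, from a classifier trained to predict which skill produced it.
    z: index of the skill the agent is currently executing.
    Reward = log q(z | s) - log p(z), with a uniform prior p(z) = 1/n_skills.
    """
    log_q = discriminator_logits - np.logaddexp.reduce(discriminator_logits)
    return log_q[z] - np.log(1.0 / n_skills)

# usage: reward the agent for reaching a state its current skill explains well
logits = np.array([2.0, 0.1, -1.0, 0.5])       # discriminator output for this state
r = skill_discovery_reward(logits, z=0, n_skills=4)
```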
How to make intrinsic motivation safe? Suppose we're trying to maximize entropy of the state distribution. A few issues:
- The apparent entropy depends on the state representation: e.g., you can maximize the estimated entropy just by learning a representation that focuses on uncontrollable noise (see the toy example after this list). So on its own this doesn't seem like a great objective for representation learning.
- this incentivizes the agent to do all kinds of crazy shit, like a sociopathic kid who vivisects animals just to see what happens.
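A toy illustration of the first issue, reusing the k-NN entropy estimate from above (the shapes and scales are made up): if the representation keeps dimensions the agent doesn't control, such as observation noise, the estimated entropy comes out high even when the agent's actual behavior is nearly constant, so the objective can be "satisfied" by the representation rather than by exploration.

```python
import numpy as np

def knn_entropy(x, k=3):
    """Mean log k-NN distance: a crude batch estimate of state entropy."""
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return np.log(np.sort(d, axis=1)[:, k - 1] + 1e-6).mean()

rng = np.random.default_rng(0)
controlled = 0.01 * rng.standard_normal((512, 1))   # the agent barely moves
noise = rng.standard_normal((512, 4))               # uncontrollable noise, e.g. a noisy TV

print(knn_entropy(controlled))                      # low: behavior is not diverse
print(knn_entropy(noise))                           # high, with zero actual exploration
print(knn_entropy(np.hstack([controlled, noise])))  # noise dominates the estimate
```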
Is there work on this? Seems like an important area for AI safety.