
intrinsic motivation

This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

objectives:

  • maximize entropy of the state visitation distribution
    • requires empirical estimates of entropy, e.g. a particle-based k-nearest-neighbor estimator (see the sketch after this list)
  • maximize mutual information between a random goal vector z and the state distribution
    • MI = entropy - conditional entropy, i.e. I[s; z] = H[s] - H[s | z], so you can view this as the previous maxent objective plus an extra term to minimize H[s | z]: we want to explore lots of states overall, but each individual goal should lead to a narrow state distribution (see the second sketch below)
    • papers: variational intrinsic control (2016), ..., active pretraining with successor features (2021)
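
In practice the entropy term has to be estimated from samples. Here is a minimal sketch of one common approach, a particle-based k-nearest-neighbor bonus in the spirit of APT-style methods; the function name, the buffer interface, and k = 12 are illustrative assumptions, not anything from a specific paper.

```python
import numpy as np

def knn_entropy_reward(states: np.ndarray, buffer: np.ndarray, k: int = 12) -> np.ndarray:
    """Intrinsic reward proportional to the log distance to the k-th nearest neighbor.

    states: (n, d) batch of (possibly encoded) states to score.
    buffer: (m, d) sample of previously visited states.
    A large k-NN distance means the state sits in a low-density region of the
    visitation distribution, so rewarding it pushes the policy toward higher entropy.
    """
    # Pairwise Euclidean distances between the batch and the buffer: shape (n, m).
    dists = np.linalg.norm(states[:, None, :] - buffer[None, :, :], axis=-1)
    kth = np.sort(dists, axis=-1)[:, k]   # distance to the k-th nearest neighbor in the buffer
    return np.log(1.0 + kth)              # log keeps the bonus well-scaled
```

Distances are usually computed in a learned embedding rather than raw observations, which is exactly where the representation concern below comes in.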
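For the MI objective, the usual practical route is a variational lower bound with a learned discriminator, as in DIAYN-style skill discovery. The sketch below is illustrative only (network sizes, names, and the uniform skill prior are my assumptions). Note it uses the symmetric decomposition I[s; z] = H[z] - H[z | s], bounding H[z | s] with a classifier q(z | s), because that only needs a classifier over skills rather than a density model over states.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

N_SKILLS, STATE_DIM = 16, 8   # hypothetical sizes

# q(z | s): a classifier that tries to infer the active skill z from the state.
discriminator = nn.Sequential(
    nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, N_SKILLS)
)
log_p_z = torch.log(torch.tensor(1.0 / N_SKILLS))  # uniform prior over skills

def intrinsic_reward(states: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Reward = log q(z|s) - log p(z): high when the state reveals which skill is running."""
    log_q = F.log_softmax(discriminator(states), dim=-1)            # (batch, N_SKILLS)
    return log_q.gather(-1, z.unsqueeze(-1)).squeeze(-1) - log_p_z

def discriminator_loss(states: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Train q(z|s) by classification; this tightens the bound on -H[z | s]."""
    return F.cross_entropy(discriminator(states), z)
```

The skill-conditioned policy is trained with intrinsic_reward as its reward; making log q(z | s) large forces different z's into distinguishable, i.e. narrow, state distributions, which is the "extra term" above.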

How to make intrinsic motivation safe? Suppose we're trying to maximize entropy of the state distribution. A few issues:

  • The apparent entropy depends on the state representation: e.g., you can maximize entropy by learning a representation that focuses on noise. So on its own this doesn't seem like a great objective for representation learning (see the small demo after this list).
  • this incentivizes the agent to do all kinds of crazy shit, like a sociopathic kid who vivisects animals just to see what happens.
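
To make the representation point concrete, here is a tiny made-up demo reusing the hypothetical knn_entropy_reward sketch above: identical do-nothing trajectories score as novel under an encoder that keeps a noise channel, and as having zero novelty under one that keeps only the controllable coordinate.

```python
import numpy as np

rng = np.random.default_rng(0)
position = np.zeros((1000, 1))            # the agent never actually moves
noise = rng.normal(size=(1000, 1))        # an irrelevant noisy sensor

with_noise = np.concatenate([position, noise], axis=1)  # encoder keeps the noise channel
without_noise = position                                # encoder keeps only the position

print(knn_entropy_reward(with_noise[:10], with_noise).mean())        # small but positive: noise looks like novelty
print(knn_entropy_reward(without_noise[:10], without_noise).mean())  # exactly 0: no apparent novelty at all
```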

Is there work on this? Seems like an important area for AI safety.