Notes on the Alignment Forum's Value Learning sequence, curated by Rohin Shah. Ambitious value learning: the idea of learning 'the human…
The standard [Markov decision process] formalism includes a reward function; the total (discounted) reward across a trajectory is its…
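For concreteness, a minimal sketch of that quantity under the usual conventions (γ for the discount factor, r_t for the reward received at step t, and τ for a trajectory of length T; indexing conventions vary between sources):

$$
R(\tau) = \sum_{t=0}^{T} \gamma^{t}\, r_{t}
$$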
Silver, Singh, Precup, and Sutton argue that Reward is enough: maximizing a reward signal implies, on its own, a very broad range of…
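Loosely in symbols (my paraphrase, not the paper's own statement): the hypothesis is that an agent doing well on the objective

$$
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r_{t}\right]
$$

in a sufficiently rich environment would, as a by-product, exhibit a very broad range of associated abilities.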
Note: see [reinforcement learning notation] for a guide to the notation I'm attempting to use throughout my RL notes. Three paradigmatic…
There are three main approaches to moral philosophy: [utilitarian]ism: you should feed a starving person because it will increase 'global…
AI safety, as a term, is sterile and hard to get excited about. Preventing catastrophe is important, but doesn't motivate me, since [the…