The state transitions we observe in [ reinforcement learning ] are typically correlated over time, both within a trajectory (obviously) and…
(See also my [ deep RL notes ] from John Schulman's class several years ago, which cover much of the same material.) We can approach…