Deep deterministic policy gradient (DDPG) is an interesting RL algorithm with a somewhat misleading name. Although its name indicates that…
For any reward function and policy , consider the entropy-regularized reward Taking as our objective the (expected, discounted…