Hamiltonian: Nonlinear Function
Created: May 08, 2020
Modified: July 14, 2022

Hamiltonian

This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.
  • What are Hamiltonian dynamics?
  • A system has phase space coordinates (p, q), representing momenta and positions respectively. How do these evolve over time?
  • The Hamiltonian is any function H(p, q, t) such that
\frac{dp}{dt} = -\frac{\partial H}{\partial q} \qquad \frac{dq}{dt} = \frac{\partial H}{\partial p}
  • Often this is the total energy of the system, i.e., the sum of kinetic energy (a function of the momenta p) and potential energy (a function of the positions q).

The term \frac{dq}{dt} on the right is just the velocity. That is exactly the derivative of the kinetic energy (\frac{1}{2}mv^2) with respect to momentum (p = mv).

On the left, \frac{dp}{dt} is the change in momentum: the force on our system. We're saying this is the negative gradient of the potential energy. When potential energy is increasing (has positive gradient), it means we're going uphill and our momentum will decrease. And vice versa.
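The two equations above can be integrated numerically. A minimal sketch, using a 1-D harmonic oscillator H(p, q) = p^2/2m + kq^2/2 and the leapfrog scheme (the mass m, spring constant k, step size, and step count are all illustrative choices, not from the notes above):

```python
# Integrate dq/dt = dH/dp and dp/dt = -dH/dq with leapfrog steps.
def leapfrog(p, q, dH_dq, dH_dp, dt, steps):
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q)   # half step in momentum
        q += dt * dH_dp(p)         # full step in position
        p -= 0.5 * dt * dH_dq(q)   # half step in momentum
    return p, q

m, k = 1.0, 1.0  # hypothetical mass and spring constant
H = lambda p, q: p**2 / (2 * m) + k * q**2 / 2

p0, q0 = 0.0, 1.0
p1, q1 = leapfrog(p0, q0,
                  dH_dq=lambda q: k * q,    # -dp/dt: gradient of potential
                  dH_dp=lambda p: p / m,    # dq/dt: velocity
                  dt=0.01, steps=1000)

# Leapfrog is symplectic, so the energy error stays tiny over long runs.
print(abs(H(p1, q1) - H(p0, q0)))
```

The interesting property is that a naive Euler integrator would let the energy drift steadily, while the symplectic leapfrog keeps H nearly conserved.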

  • The log-density function that we write in ML is just the potential energy portion of this. That's because there is some theorem ?? that the probability that Hamiltonian dynamics ends up in a state with potential energy x is proportional to e^{-x}, so we can simulate from any distribution by simulating from the Hamiltonian -\log p(x) + \frac{1}{2}kv^2.
  • In QM, the Hamiltonian is an 'operator' on wave functions. What does that mean?
  • Qs:
    • when in physics history were these developed, and why? what are the advantages of this approach?
    • what are the alternatives and when does it make sense to use them?
    • Why does Hamiltonian mechanics sample from e^{-\text{potential energy}}?
    • What's the analogy to machine learning?
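The ML identification above (potential energy U(x) = -log p(x), kinetic energy \frac{1}{2}v^2) is the core of Hamiltonian Monte Carlo. A rough sketch, targeting a standard normal; all tuning constants (step size, path length, sample count) are illustrative, not canonical:

```python
import math
import random

def hmc_sample(U, grad_U, x0, dt=0.1, leapfrog_steps=20, n_samples=2000):
    """Draw samples from p(x) ∝ e^{-U(x)} via leapfrog + Metropolis."""
    x, samples = x0, []
    for _ in range(n_samples):
        v = random.gauss(0.0, 1.0)             # resample momentum
        x_new, v_new = x, v
        v_new -= 0.5 * dt * grad_U(x_new)      # leapfrog trajectory
        for i in range(leapfrog_steps):
            x_new += dt * v_new
            if i < leapfrog_steps - 1:
                v_new -= dt * grad_U(x_new)
        v_new -= 0.5 * dt * grad_U(x_new)
        # Metropolis correction: accept with probability e^{-(H_new - H_old)}
        dH = (U(x_new) + 0.5 * v_new**2) - (U(x) + 0.5 * v**2)
        if dH < 0 or random.random() < math.exp(-dH):
            x = x_new
        samples.append(x)
    return samples

# Target: standard normal, so U(x) = x^2 / 2 and grad U(x) = x.
random.seed(0)
xs = hmc_sample(U=lambda x: x * x / 2, grad_U=lambda x: x, x0=0.0)
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print(mean, var)  # should be near 0 and 1
```

The momentum resampling each iteration is what lets the chain explore different energy levels; the deterministic dynamics alone would stay on one level set of H.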

From Lagrangian

The Hamiltonian is the Legendre transform of the Lagrangian. Think of Lagrangian mechanics as a constrained optimization with objective \mathcal{L}(x, v) subject to \dot{x} = v. If we enforce that constraint with a Lagrange multiplier \rho, the dual objective (in the optimization sense) is the one in which we optimize out v, so we have

\begin{align*}
-d_\text{optimization}[\mathcal{L}](x, \rho)
&= -\left(\min_v \mathcal{L}(x, v) + \rho(\dot{x} - v)\right)\\
&= \max_v \rho(v - \dot{x}) - \mathcal{L}(x, v)\\
&= \max_u \rho u - \mathcal{L}(x, u + \dot{x})\\
&= \max_u \rho u - \mathcal{L}_\text{centered}(x, u)\\
&= d_\text{Legendre}[\mathcal{L}_\text{centered}](x, \rho)
\end{align*}
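As a sanity check, here is the standard worked example (not from the notes above): for the usual mechanics Lagrangian, the Legendre transform in v recovers kinetic-plus-potential energy.

\mathcal{L}(x, v) = \frac{1}{2} m v^2 - V(x), \qquad H(x, p) = \max_v \left[\, p v - \mathcal{L}(x, v) \,\right]

Stationarity in v gives p = \frac{\partial \mathcal{L}}{\partial v} = mv, i.e. v = p/m, so

H(x, p) = \frac{p^2}{m} - \frac{p^2}{2m} + V(x) = \frac{p^2}{2m} + V(x)

which is exactly the "total energy" form of the Hamiltonian mentioned at the top of these notes.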