Created: September 07, 2020
Modified: September 07, 2020
mirror descent implementations
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.
- What pieces of mirror descent can we automate?
- See also natural gradient implementations
- Given a mirror function $\psi$, we can compute the mirror map $\nabla\psi$, and the Bregman divergence $D_\psi(x, y) = \psi(x) - \psi(y) - \langle \nabla\psi(y), x - y \rangle$ (a code sketch follows this list).
- Can we compute the convex conjugate $\psi^*(\theta) = \sup_x \langle \theta, x \rangle - \psi(x)$? Of course you could optimize, but I don't think we could get an exact version.
- Can we do Bregman projection $\arg\min_{x \in \mathcal{C}} D_\psi(x, y)$? Again, of course we can optimize, but we're not going to get this in closed form for arbitrary $\psi$ and $\mathcal{C}$.
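- A minimal sketch of the autodiff-able pieces, in JAX. The mirror map and Bregman divergence fall out of `jax.grad`; the conjugate is only approximated by an inner gradient-ascent loop. Function names and the negative-entropy example are mine, not from any library.
```python
# Sketch: pieces of mirror descent we can get from autodiff, given only psi.
import jax
import jax.numpy as jnp


def mirror_map(psi):
    """Mirror map is the gradient of the mirror function: x -> grad psi(x)."""
    return jax.grad(psi)


def bregman_divergence(psi):
    """D_psi(x, y) = psi(x) - psi(y) - <grad psi(y), x - y>."""
    grad_psi = jax.grad(psi)

    def div(x, y):
        return psi(x) - psi(y) - jnp.dot(grad_psi(y), x - y)

    return div


def approx_conjugate(psi, theta, x0, lr=0.1, steps=200):
    """psi*(theta) = sup_x <theta, x> - psi(x), approximated by gradient ascent
    (not exact; assumes the iterates stay in psi's domain)."""
    obj = lambda x: jnp.dot(theta, x) - psi(x)
    x = x0
    for _ in range(steps):
        x = x + lr * jax.grad(obj)(x)
    return obj(x), x


# Example: negative entropy gives the generalized KL divergence.
neg_entropy = lambda x: jnp.sum(x * jnp.log(x) - x)
D = bregman_divergence(neg_entropy)
x = jnp.array([0.2, 0.3, 0.5])
y = jnp.array([0.25, 0.25, 0.5])
print(D(x, y))  # generalized KL(x || y)
```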
- How specifically does exponential-family mirror descent work?
- Hypothetically we start with a mean param $\mu_t$ and dual (natural) param $\theta_t$. We compute the gradient in mean-param space, $\nabla_\mu \mathcal{L}(\mu_t)$, and then apply the dual update $\theta_{t+1} = \theta_t - \eta \nabla_\mu \mathcal{L}(\mu_t)$. We then convert this back to mean params, minimizing the Bregman divergence to the desired natural parameter within the space of realizable marginals (a code sketch follows at the end of these notes).
- It seems like this assumes that our natural param was 'valid'? But in general we can get a natural param that doesn't normalize. Is that okay? It seems like the update above would still be well defined.
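- A concrete instance of this loop, sketched in JAX for a categorical family where $A(\theta) = \operatorname{logsumexp}(\theta)$ and the mean params are $\operatorname{softmax}(\theta)$. The quadratic loss and step size are illustrative only; for this family $\nabla A$ already maps any natural param to valid mean params, so no explicit Bregman projection onto the marginal polytope is needed.
```python
# Sketch: exponential-family mirror descent for a categorical distribution.
import jax
import jax.numpy as jnp


def log_partition(theta):
    return jax.scipy.special.logsumexp(theta)


mean_from_natural = jax.grad(log_partition)  # mu = grad A(theta) = softmax(theta)


def loss(mu, target):
    # Illustrative loss on mean params; anything differentiable in mu works.
    return 0.5 * jnp.sum((mu - target) ** 2)


def mirror_step(theta, target, lr=0.5):
    mu = mean_from_natural(theta)                   # mean params mu_t
    g = jax.grad(loss)(mu, target)                  # gradient in mean-param space
    theta_new = theta - lr * g                      # dual (natural-param) update
    return theta_new, mean_from_natural(theta_new)  # map back to mean params


theta = jnp.zeros(3)
target = jnp.array([0.7, 0.2, 0.1])
for _ in range(100):
    theta, mu = mirror_step(theta, target)
print(mu)  # approaches the target distribution
```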