Created: December 29, 2023
Modified: December 29, 2023
maximal update parameterization
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.
References:
- Yang, Hu (2022) Feature Learning in Infinite-Width Neural Networks https://arxiv.org/abs/2011.14522
- Yang, Hu et al. (2022) Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer https://arxiv.org/abs/2203.03466