Considering training an [ autoregressive ] model of sequence data (text, audio, action sequences in [ reinforcement learning ], etc.), which…
References: Holtzman et al. (2020), The Curious Case of Neural Text Degeneration https://arxiv.org/abs/1904.09751 How should we actually…