Created: February 01, 2022
Modified: February 10, 2022
explicit models of uncertainty
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector. (Note: this is dancing around the question of why I think probabilistic programming is not AI research, even if it will be a tremendously useful technology in its own right. I should develop this distinction between 'magic AI' and 'engineering AI'; see in silico.)
- Do I think explicit modeling of uncertainty is necessary for AI?
- Now, I'll grant right away that explicit modeling of uncertainty is absolutely necessary in a lot of technical engineering work. For example, NameRedacted's thesis is that predictions of click-through rates need to incorporate uncertainty, because we're using them to make decisions, and decision-theoretically, maximizing expected utility means computing expectations, which means you need a probability distribution (a small numeric sketch of this below).
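- A minimal sketch of that point, with made-up numbers (the Beta posterior, the payoff function, and all the constants below are hypothetical stand-ins, not anything from NameRedacted's actual work): when utility is nonlinear in the click-through rate, the expected utility under the posterior differs from the utility of the point estimate, so the decision really does need the distribution.

  ```python
  import numpy as np

  rng = np.random.default_rng(0)

  # Hypothetical posterior over an ad's click-through rate after observing
  # 3 clicks in 20 impressions: Beta(1 + 3, 1 + 17) under a uniform prior.
  ctr_samples = rng.beta(1 + 3, 1 + 17, size=100_000)

  def utility(ctr, value_per_click=1.0, floor=0.12):
      # Made-up nonlinear payoff: showing the ad only pays off if the
      # realized rate clears a floor; otherwise it costs a little.
      return np.where(ctr > floor, value_per_click * ctr, -0.05)

  expected_utility = utility(ctr_samples).mean()          # E[U(ctr)] under the posterior
  utility_of_point = float(utility(ctr_samples.mean()))   # U(E[ctr]) from a point estimate

  # These two numbers differ, so a rule that maximizes expected utility
  # needs the whole distribution, not just the point prediction.
  print(expected_utility, utility_of_point)
  ```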
- Now, on the other hand, let's look at the things in my lifetime that felt like real AI: step changes in the kinds of things computers can do.
- In the 90s, Deep Blue defeated the reigning world chess champion.
- Much later, AlexNet and the rise of deep learning in vision and audio.
- AlphaGo and the general maturing of game AI.
- And now, most recently, transformers; GPT in particular has, I think, really surprised people with the depth of behavior you can get from just language modeling and maximum-likelihood training.
- All of these advances have some relationship to uncertainty, but I would contend that in no case was explicit modeling of uncertainty necessary. The frameworks used for these systems are compatible with uncertainty modeling, but the core advances that enabled new capabilities are essentially orthogonal.
- Let's consider GPT-3 as the most recent and impressive example. It's certainly important that it's trained to produce probabilities, maximizing the likelihood of observed data. But it is explicitly not a probabilistic programming system, and it doesn't have an explicit model of the world. It certainly isn't Bayesian in a computational sense---it doesn't maintain any explicit probability distribution over models. It has 175 billion weights, but it's not uncertain about what those weights are (a toy contrast below).
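- To make that contrast concrete, here is a toy illustration of my own (a unigram model over a six-word corpus; it has nothing to do with GPT's actual training code): maximum likelihood yields one point estimate of the parameters, whereas a Bayesian treatment would maintain a distribution over them.

  ```python
  from collections import Counter

  corpus = "the cat sat on the mat".split()
  counts = Counter(corpus)
  total = sum(counts.values())

  # Maximum-likelihood estimate: a single setting of the parameters,
  # with no representation of uncertainty about them.
  theta_mle = {w: c / total for w, c in counts.items()}

  # A Bayesian treatment (which GPT does not do) would instead keep a
  # posterior over the parameters, e.g. Dirichlet(1 + counts) under a
  # uniform prior.
  dirichlet_posterior = {w: 1 + c for w, c in counts.items()}

  print(theta_mle["the"])            # 0.333... -- one number per parameter
  print(dirichlet_posterior["the"])  # 3 -- a pseudo-count defining a distribution over theta
  ```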
- So of course research often involves speculating about things that haven't happened yet; if we knew how to use uncertainty to solve AI, there wouldn't be research anymore. So it's valid to have the hypothesis that explicit modeling of uncertainty is going to be necessary, even if it hasn't been yet. Now, where does the rubber meet the road on this vision? You could say: GPT is not a decision-making system, and actual decision-making systems may need to think more explicitly in terms of probabilities. This might not be necessary to get to human performance---I feel like I don't reason explicitly about uncertainty most of the time---but it might be for really superhuman things: decisions bigger than the ones humans evolved to make on our own, things like organizational strategy that affect many people and involve many abstract considerations. There you do start to need to rely on the formal theory of decision making, because I don't have enough data to have built intuitions I can trust for those kinds of high-stakes decisions (a sketch of what that looks like below).
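- A minimal sketch of what "relying on the formal theory" means here (the scenarios, probabilities, and payoffs are entirely invented): make the probabilities and stakes explicit, then pick the strategy with the highest expected utility instead of leaning on intuition.

  ```python
  # Two hypothetical strategies evaluated under explicitly elicited scenario
  # probabilities; every number here is made up for illustration.
  scenarios = {"market grows": 0.5, "market flat": 0.3, "market shrinks": 0.2}

  payoffs = {
      "expand":      {"market grows": 10.0, "market flat": 1.0, "market shrinks": -8.0},
      "consolidate": {"market grows":  3.0, "market flat": 2.0, "market shrinks":  0.0},
  }

  def expected_utility(strategy):
      # Expected utility = sum over scenarios of P(scenario) * payoff(strategy, scenario).
      return sum(p * payoffs[strategy][s] for s, p in scenarios.items())

  best = max(payoffs, key=expected_utility)
  print(best, expected_utility(best))  # "expand", 3.7 -- the choice rests on explicit probabilities
  ```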
- Of course this is a hypothesis. Ultimately, there is some optimal strategy for generalizing from small data, and if you train a system with enough meta-learning, then it will learn to develop the formal decision theory itself, or maybe something better. But relying on your system to develop the formal theory is kind of an abdication of the research process. We need to put something into the system. So what are we going to put in?