AI for math
Created: May 23, 2021
Modified: April 26, 2022

This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

Doing math seems like a really promising area for AI. And by 'math' I mean math research (not arithmetic, which computers are already really good at). That can mean proving theorems, but also includes creativity: coming up with new mathematical concepts. In fact the two are unavoidably interrelated: it's often necessary to invent new concepts when proving a difficult theorem, and to prove theorems as part of defining a new abstraction.

There is no data scarcity in math research. Like Go or Atari games, it's a completely theoretical universe; we can simulate everything computationally. The world is fully observed; in some sense there is no world to even be uncertain about. These are the problems where AI has done really well.
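
To make the no-data-scarcity point concrete, here's a toy sketch (my own illustration, not any existing system's pipeline): generate random propositional formulas and label each one as a tautology or not by brute-force truth-table evaluation. Every label is perfect and free, and the supply is unlimited.

```python
import itertools
import random

VARS = ["p", "q", "r"]

def random_formula(depth=3):
    """Sample a random propositional formula as a nested tuple."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(VARS)
    op = random.choice(["and", "or", "implies", "not"])
    if op == "not":
        return ("not", random_formula(depth - 1))
    return (op, random_formula(depth - 1), random_formula(depth - 1))

def evaluate(f, env):
    """Evaluate a formula under a truth assignment."""
    if isinstance(f, str):
        return env[f]
    if f[0] == "not":
        return not evaluate(f[1], env)
    a, b = evaluate(f[1], env), evaluate(f[2], env)
    return {"and": a and b, "or": a or b, "implies": (not a) or b}[f[0]]

def is_tautology(f):
    """Brute-force check over all 2^n assignments: a perfect, free label."""
    return all(evaluate(f, dict(zip(VARS, vals)))
               for vals in itertools.product([True, False], repeat=len(VARS)))

# Unlimited (formula, label) training pairs, no human annotation required.
dataset = [(f, is_tautology(f)) for f in (random_formula() for _ in range(1000))]
```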

Transformers seem to be really good at dealing with symbolic representations, and math is expressed almost entirely in symbolic notation.
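
For instance (a toy sketch of my own, in the spirit of published seq2seq-for-symbolic-math work): an expression tree flattens into a prefix token sequence, which is exactly the kind of input a transformer consumes.

```python
def to_prefix_tokens(expr):
    """Flatten a nested-tuple expression tree into prefix-order tokens."""
    if isinstance(expr, str):
        return [expr]
    tokens = [expr[0]]
    for child in expr[1:]:
        tokens.extend(to_prefix_tokens(child))
    return tokens

# x^2 + sin(x) as a tree, serialized for the model:
expr = ("add", ("pow", "x", "2"), ("sin", "x"))
print(to_prefix_tokens(expr))  # ['add', 'pow', 'x', '2', 'sin', 'x']
```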

Given the relative rates of progress in AI capabilities and in human mathematics research, I think it's likely that many of the remaining Millennium Prize problems, like the Riemann Hypothesis, or P vs NP, will ultimately be resolved by AI mathematicians. (things I believe that no one else believes)

Research into math AI also seems likely to generalize. If an architecture works well for mathematical and conceptual reasoning, then hooking it up to GPT should make it smarter and better at learning abstract subjects.

What are the big limitations? There are of course lots of open research questions, but imagining that we had time and funding to attack those, what would keep us from making progress?

  • Compute? Math isn't data-limited, but it is a difficult problem, and training large models is inherently expensive. As Andrej Karpathy said about the Tesla Autopilot project, their ultimate limit is not the amount of available data, but the need to keep the iteration time down to a couple of weeks.
  • Supervision. If we don't want to use formal verification (see below), we still have to ground the AI in the notion of 'truth' somehow. It could produce human-legible text, and be evaluated by a trained discriminator, but even training a discriminator to follow and judge the quality of mathematical reasoning is a nontrivial open problem.
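
As a cartoon of what discriminator-based supervision might look like (all names and data here are hypothetical; the open problem is making the discriminator track validity rather than surface style):

```python
from collections import Counter
import math

def featurize(proof_text):
    """Crude bag-of-words features. A real discriminator would have to
    follow the reasoning, which is exactly the hard part."""
    return Counter(proof_text.lower().split())

def train_discriminator(good_proofs, bad_proofs):
    """Naive-Bayes-style per-token log-odds with add-one smoothing."""
    good, bad = Counter(), Counter()
    for p in good_proofs:
        good.update(featurize(p))
    for p in bad_proofs:
        bad.update(featurize(p))
    return {t: math.log((good[t] + 1) / (bad[t] + 1))
            for t in set(good) | set(bad)}

def score(weights, proof_text):
    """Higher score = more 'proof-like'; usable as a (weak) reward signal."""
    return sum(weights.get(t, 0.0) * c for t, c in featurize(proof_text).items())

weights = train_discriminator(
    good_proofs=["by induction on n the base case holds and the step follows"],
    bad_proofs=["assume the result therefore the result holds qed"],
)
print(score(weights, "we proceed by induction on n"))
```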

What are not limitations?

  • (Disputed) The state of mathematical formalization tools, like Coq or Lean. Sure, those can help an AI, in the same way that a human could use them, and the existence of large bodies of formalized math seems like a valuable source of training data, but I don't require that 'the AI learns algebraic geometry' is the same event as 'we formally verify algebraic geometry'. Humans learn math informally, as a set of intuitions and concepts and human-checkable proofs.
    • Importantly: in order to be really useful, the AI will have to write math papers. The goal of math is human understanding of math (https://arxiv.org/abs/math/9404236). A good proof isn't just formally verifiable: it tells you why something is true (better yet, why it's interesting), and it gives you an intuition for what other kinds of things might be true. And even more important than good proofs are good definitions and good conjectures. We already know that P != NP. That's not the point. The point is to build better conceptual tools for reasoning about computing. A well-structured formal proof can do this, but it's neither necessary nor sufficient.
    • Counterpoint: formal proofs of hard things would have to be well-structured, since that's the only way we'll discover them with a feasible amount of search. A formal proof of the Riemann hypothesis would necessarily contain a set of abstractions that humans could follow, just as we read code; for results that big, people will be willing to try. Formal verification provides a real supervisory signal, and a natural representation, so it gives us a leg up on the problem even if we ultimately want to work in terms of written explanations. (But even still: the history of deep learning is that learned representations are better.)
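
For flavor, here's what the formal supervisory signal looks like at its most trivial (Lean 4, using a lemma from the core library): the kernel either accepts the proof term or it doesn't, a binary reward with no human in the loop.

```lean
-- Nothing like the Riemann hypothesis, but the interface is the same:
-- a term whose type is the theorem statement, checked by the kernel.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```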

Getting good at math requires a facility with different types of mental models: geometric reasoning, linguistic reasoning, heuristic intuition. Any particular formal concept probably corresponds to multiple mental pictures and can be looked at with many lenses. And even purely formally, you end up working with many models: a particular structure may be (say) both a group and a vector space, each of which gives it some 'constraints' (or seen differently, provides it with a particular API).
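
In programming terms (my own illustration of the many-lenses point): the same object can satisfy several interfaces at once, and each interface licenses different moves.

```python
from typing import Protocol

class Group(Protocol):
    """One lens: the additive-group API (composition and inverses)."""
    def add(self, other): ...
    def neg(self): ...

class VectorSpace(Protocol):
    """Another lens on the same object: scaling by field elements."""
    def add(self, other): ...
    def scale(self, c: float): ...

class R2:
    """R^2 satisfies both interfaces; the group lens gives you cancellation,
    the vector-space lens gives you bases and dimension."""
    def __init__(self, x: float, y: float):
        self.x, self.y = x, y
    def add(self, other: "R2") -> "R2":
        return R2(self.x + other.x, self.y + other.y)
    def neg(self) -> "R2":
        return R2(-self.x, -self.y)
    def scale(self, c: float) -> "R2":
        return R2(c * self.x, c * self.y)
```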

I'd imagine that existing foundation models will be highly useful: language models, programming models like Codex, image models (e.g. trained on geometric illustrations, 3blue1brown videos, etc.), even video models for things like physics, might provide useful representations/priors/etc.

Doing math requires the ability to develop new abstractions that can be saved and reused.

It is a real decision-making problem: it requires meta-reasoning (reasoning about what to spend time thinking about) and curiosity. It may not be obvious how to prove a hard theorem, or what kind of structures will be useful, so really ambitious math AIs will need to do a lot of fairly open-ended searching for 'interesting' abstractions, just as human mathematicians do today.
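
Schematically, the meta-reasoning loop might look like best-first search over conjectures (a sketch of mine; the `interestingness` and `neighbors` functions are stand-ins for exactly the parts that are hard):

```python
import heapq

def interestingness(conjecture):
    """Stand-in heuristic: prefer short statements with diverse vocabulary.
    Estimating what's actually worth thinking about is the open problem."""
    tokens = conjecture.split()
    return len(set(tokens)) / len(tokens) - 0.01 * len(tokens)

def neighbors(conjecture):
    """Stand-in generator: specialize or transform a conjecture."""
    return [conjecture + " for primes", "converse of " + conjecture]

def explore(seeds, budget=100):
    """Best-first search: always expand whatever currently looks most
    interesting, i.e. decide what to spend time thinking about."""
    frontier = [(-interestingness(c), c) for c in seeds]
    heapq.heapify(frontier)
    seen = {c for _, c in frontier}
    visited = []
    while frontier and budget > 0:
        neg_score, c = heapq.heappop(frontier)
        visited.append((-neg_score, c))
        budget -= 1
        for n in neighbors(c):
            if n not in seen:
                seen.add(n)
                heapq.heappush(frontier, (-interestingness(n), n))
    return visited

print(explore(["every even number is the sum of two primes"])[:3])
```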

What are the technical questions to explore?

  • Architecture. A math AI can't be a pure transformer; it needs some kind of knowledge base, a large store of abstract knowledge, alongside a working memory or scratchpad analogous to a mathematician's notebook. (A schematic sketch follows this list.)

  • Supervision. Is it sufficient to just train an augmented transformer model on the corpus of math papers? Or do we need to generate data or environments by a more formal process? Since most learning is by demonstration, will we need to give it access to the informal reasoning of mathematicians (which argues against fully formalized data generation)?

  • In the absence of a formal system, this is not an object-level RL problem, but it might be an RL-on-text problem. That is, we require long-term generation that doesn't just mimic a previous corpus, but is in service of a goal. (And achieving this goal can itself require hierarchical planning.)

  • Next steps:

    • Look at some of the actual extant systems. What is the n2formal team doing? DeepMind/OpenAI? Are there competitions in the field? Is anyone in NLP doing this stuff purely with text?
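
To make the architecture bullet above concrete, here's a schematic of one way the pieces could fit together (every name here is hypothetical; the LM is a stub so the loop runs end to end):

```python
class KnowledgeBase:
    """Long-lived store of reusable abstractions (definitions, lemmas)."""
    def __init__(self):
        self.entries = []
    def add(self, name, statement):
        self.entries.append((name, statement))
    def retrieve(self, query, k=3):
        # Toy relevance: token overlap. A real system would embed and search.
        overlap = lambda s: len(set(query.split()) & set(s.split()))
        return sorted(self.entries, key=lambda e: -overlap(e[1]))[:k]

def prove(goal, lm_generate, kb, max_steps=10):
    scratchpad = []                           # the mathematician's notebook
    for _ in range(max_steps):
        context = kb.retrieve(goal)           # pull in abstract knowledge
        step = lm_generate(goal, context, scratchpad)
        scratchpad.append(step)               # working memory grows per step
        if step.startswith("QED"):
            kb.add(goal, " ; ".join(scratchpad))  # save the result for reuse
            return scratchpad
    return None                               # goal-directed, not just mimicry

def stub_lm(goal, context, scratchpad):
    """Placeholder for the actual model."""
    return "QED (stub)" if len(scratchpad) >= 2 else f"consider {goal}"

kb = KnowledgeBase()
kb.add("add_comm", "a + b = b + a")
print(prove("n + 0 = n", stub_lm, kb))
```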

What are the applications of good math AI? Would it be limited to pure math, or would it be able to help out in other branches of science, or in industry? What's an example of the win? If you were trying to get R&D funding from science funders or from VCs, what would your proposal be?

Q: How does a math AI relate to a coding AI? It seems like by the Curry-Howard correspondence (and just the observation that math and coding skills are quite related in people) they should end up being quite similar systems.

  • A1: A coding AI might be even easier to develop, since code is inherently formal, so you get to skip the first-formalize-all-of-math step.
  • A2: Coding AI has incredibly obvious uses as an accelerator for all software development. In that sense the applications are almost infinite.
  • A3: The best bet for getting general reasoning capabilities to emerge is probably to co-train a model to code and do math.
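
The Curry-Howard point in miniature (Lean 4): propositions are types and proofs are programs, so a proof of P → P is literally the identity function, and modus ponens is function application.

```lean
theorem id_prop (P : Prop) : P → P :=
  fun hp => hp            -- the identity function, as a proof

theorem modus_ponens (P Q : Prop) : (P → Q) → P → Q :=
  fun hpq hp => hpq hp    -- function application, as a proof
```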


One of my reservations about automating math (even if it is pure, cool, interesting, and doable) is that I don't have a good sense of how it's useful. Lots of budding mathematicians go through a crisis of meaning when they realize that most pure math has no impact outside of math. Many come to terms with it --- they think of what they do as art, spirituality ("knowing the mind of God"), or hope that it will eventually have uses they don't currently foresee (as number theory led to cryptography). I don't think that's my path. I want to do stuff that matters. And it's instrumentally valuable to do stuff that people will pay for, so you can make money and recruit good people.

So I want to think about math that matters. And here I think there are actually good answers! Theoretical CS, physics, perhaps economics, statistics. Basically all of modeling and probabilistic programming.

  • Even programming itself. A lot of programming in technical domains (graphics, robotics, statistics, control systems, etc.) involves derivation: a basic mathematical concept (the EM algorithm, raytracing, collision detection, etc.) may need to be specialized to the current situation (data structures, assumptions that can be made, etc.). (A worked example follows this list.)
  • Even if it's not new code. Maybe there are systems (in aerospace, finance, industrial automation, etc.) that need to be ported to a new language, have features added, or be made more flexible. Sometimes you can do this with a mechanical translation of the instructions, but sometimes you really need to understand what the system is doing. That understanding can be about the real world, but it can also be mathematical.
  • Algorithmic development often rests on theoretical guarantees. If we want better optimizers, planners, or inference algorithms, those will need to be invented and proved correct (or consistent, or to have good convergence rates, etc.). Similarly for privacy schemes, security protocols, and distributed consensus.
  • Formal verification of programs is "boring" math but obviously very useful in many domains.
  • I don't know applied physics/chem/econ as well, but I imagine those fields involve a lot of math-y work. Some of it might be explicit theorem-proving. Some might be derivation, computation. There is a market for making this stuff better. It might be the Mathematica market.
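
As a worked example of the derivation-as-programming point from the first bullet above (my example, using the standard textbook formulas): the generic EM recipe specialized to a one-dimensional two-component Gaussian mixture.

```python
import numpy as np

def em_gmm_1d(x, iters=50):
    """EM for a 1-D mixture of two Gaussians."""
    mu = np.array([x.min(), x.max()])        # crude initialization
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])                # mixture weights
    for _ in range(iters):
        # E-step: responsibilities r[i, k] proportional to pi_k * N(x_i | mu_k, sigma_k)
        dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
               / (sigma * np.sqrt(2 * np.pi))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted maximum-likelihood updates
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        pi = nk / len(x)
    return mu, sigma, pi

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 0.5, 200)])
print(em_gmm_1d(data))
```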