In the spirit of [ prediction as a model-building exercise ]. Language modeling: system writes publishable poetry: debatably already…
Modified: April 07, 2022.
Most 'AI' in business domains doesn't have any intellectual relationship to AGI research. In the vast majority of problem spaces, what you…
Modified: March 02, 2022.
Doing [ math ] seems like a really promising area for AI. And by 'math' I mean math research (not arithmetic, which computers are already…
Modified: April 26, 2022.
As of April 2021: Giant [ transformer ]s work better than anyone has a right to expect. GPT-3 is fucking amazing. [ DALL-E ] clearly has some…
Modified: April 23, 2021.
AI safety, as a term, is sterile and hard to get excited about. Preventing catastrophe is important, but doesn't motivate me, since [ the…
Modified: January 24, 2022.
Modified: February 13, 2023.
References: Dayan, Hinton, Neal, Zemel (1994) https://www.cs.toronto.edu/~hinton/absps/helmholtz.pdf This paper is one of the first to…
Modified: March 24, 2024.
Minerva: they basically fine-tuned a language model on arXiv papers and web pages containing LaTeX, so that it can produce LaTeX. This is…
Modified: August 03, 2022.
On the Robot Brains podcast, Andrej Karpathy explained to Pieter Abbeel why he thinks Tesla has the right approach to self-driving. Tesla…
Modified: January 24, 2022.
Abstraction is lossy [ compression ]. A good abstraction throws away everything not relevant to a particular problem, while preserving a…
Modified: February 20, 2022.
There is a similarity in kind between the negative effects of: rape; slavery; feeling forced to work on someone else's projects, by a boss…
Modified: March 02, 2022.
see also [ agency ]
Modified: February 25, 2022.
see also [ agent ] agency has two parts: [ yin and yang ]: yang agency: take steps. break apart a goal into subgoals until one is tractable…
Modified: April 13, 2025.
Intelligence is what makes humans special. PhdAdvisor defines it as the ability to make high-quality decisions. I think we can distinguish…
Modified: March 02, 2022.
One of the best ideas in machine learning. (I even thought so in 2011!) There are two common mechanisms: 'soft' and 'hard'. In both cases…
Modified: January 24, 2022.
http://www.incompleteideas.net/IncIdeas/BitterLesson.html The bitter lesson is based on the historical observations that 1) AI researchers…
Modified: July 25, 2023.
In the discourse around [ AI safety ] you sometimes see the claim that research on AI capabilities is harmful to the extent that it outpaces…
Modified: February 26, 2022.
Arguably the core insight of deep learning / [ differentiable program ]ming is that the shape and structure of the computations we do are so…
Modified: September 13, 2022.
How do you start building and selling [ computational therapy ]? It can't just be a medical product, because that's a hugely regulated and…
Modified: February 10, 2022.
See also: [ computational life coach ] A recurring dream I have is to use AI to solve mental health. It is simultaneously one of the most…
Modified: May 16, 2022.
Philosophical views on consciousness: Buddhist and meditative traditions focus on [ awareness ]. They claim that consciousness has nothing…
Modified: July 14, 2023.
Okay so there’s a lot of research on what conversations are, what the goals are (of course I don’t know most of this research…). It seems as…
Modified: February 13, 2022.
paper: Chen, Lu, et al. 2021, https://arxiv.org/abs/2106.01345 Trajectories are represented as sequences $\tau = (\hat{R}_1, s_1, a_1, \hat{R}_2, s_2, a_2, \ldots, \hat{R}_T, s_T, a_T)$, where $\hat{R}_t$ is the return-to-go, i.e…
Modified: April 15, 2022.
Notes from John Schulman's Berkeley course on deep [ reinforcement learning ], Spring 2016. Value vs Policy-based learning Value-based…
Modified: February 22, 2022.
Maybe a stupid idea, but I wonder if the idea behind differentiable physics simulators (like Brax) can be extended more broadly to rich…
Modified: July 04, 2022.
References: Direct Preference Optimization: Your Language Model is Secretly a Reward Model This seems like a compelling reframing of…
Modified: May 31, 2023.
Notes on Abram Demski and Scott Garrabrant's sequence on Embedded Agency Embedded Agents: Classic models of rational [ agency ], such as…
Modified: April 07, 2023.
A consequence of [ phase transition ]s in [ large models ] is that models may end up having capabilities we didn't expect. For example…
Modified: April 07, 2022.
(note: this is dancing around the issues around why I think [ probabilistic programming is not AI research ], even if it will be a…
Modified: February 10, 2022.
On an evolutionary timescale, it's useful to evolve structures that can learn quickly. The nervous system is an evolved organ system for…
Modified: October 27, 2022.
Is there such a thing as 'general intelligence'? What capabilities does it require? Is it a goal worth striving for? We usually speak about…
Modified: August 30, 2023.
If you need to open a specific lock, you can use a key that encodes the precise information needed to open that lock. If you need to open…
Modified: September 01, 2023.
Sutton and Barto use this as a general term for any form of interleaving policy evaluation steps with policy improvement steps. This…
Modified: March 22, 2022.
An intelligent [ agent ] should work to understand the world. This understanding takes the form of a set of relevant [ abstraction ]s, a…
Modified: May 22, 2021.
A nice observation from Percy Liang on the relationship between language modeling and grounded understanding: Just because you don't…
Modified: April 29, 2022.
The plans I make now include components that would have been impossible for me to conceive of as a kid. At the moment (July 2020), I'm…
Modified: July 08, 2020.
In the 21st century, humanity is developing a new form of engineering. Rather than manually designing artifacts, we are optimizing over…
Modified: February 25, 2022.
To achieve final goals, we have to break them down into a hierarchy of instrumental goals, and then get to work on achieving those. And for…
Modified: February 25, 2022.
Boaz Barak writes in GPT as an "Intelligence Forklift." that [ language model ]s seem to function effectively as [ tool AI ] that can…
Modified: September 29, 2023.
A lot of discussion around [ artificial intelligence ] implicitly conflates intelligence with [ consciousness ]. It assumes that as we…
Modified: January 18, 2023.
Taco Cohen speculates on Large Control Policies as a successor to large language models: https://twitter.com/TacoCohen/status…
Modified: April 15, 2022.
With varying degrees of clarity and certainty. We are [ embedded agent ]s. So are any AI systems we build. We exist inside the world; the…
Modified: December 30, 2024.
What does it mean to [ love ] someone? Of course this question has as many answers as there are people, and probably more. But here's one…
Modified: November 28, 2023.
What does it mean to love someone? Of course this question has as many answers as there are people, and probably more. But here's one view…
Modified: November 28, 2023.
An idea I got from [ John Higgs ]'s discussion of metamodernism is that taking [ all models are wrong ] to its logical conclusion requires…
Modified: January 06, 2023.
References: Risks from Learned Optimization in Advanced Machine Learning Systems A [ reinforcement learning ] algorithm attempts to find the…
Modified: March 28, 2023.
Stuart Russell told the story of giving a talk on meta-reasoning at Stanford, with Don Knuth in the audience, where he opened with a slide…
Modified: October 17, 2022.
A very natural form of [ meta-reasoning ] that selects the most promising computations. The simplest form of 'expanding' a node assumes a…
Modified: March 22, 2022.
From a conversation I had about [ attention ] mechanisms in deep architectures. Maybe that terminology is too suggestive --- it's just a…
Modified: March 03, 2024.
A very incomplete and maybe nonsensical intuition I want to explore. Classically, people talk about very simple [ reward ] functions like…
Modified: March 31, 2023.
How do we maintain values when our models of the world shift? If someone's goal in life is to "do God's will", and then they come to believe…
Modified: April 12, 2023.
reading the Perceiver papers from DeepMind: Perceiver: Jaegle et al 2021 https://arxiv.org/abs/2103.03206 Perceiver-IO: Jaegle et al 202…
Modified: September 25, 2023.
The AI Effect refers to the widely-recognized phenomenon that 'once we know how to do it, it's not AI'. For example, playing chess well…
Modified: May 29, 2020.
There are a few ways to do this. Google's PaLM uses rotary embeddings so it seems like that's probably close to the state of the art? But…
Modified: September 28, 2023.
Consider an agent that is purely concerned with [ predictive processing ]: finding the optimal [ compression ], or equivalently the optimal…
Modified: April 12, 2023.
A Bayesian view of (one aspect of) [ attention ] inspired by a conversation with Shamil Chandaria on [ predictive processing ]. (but this…
Modified: May 25, 2023.
Can we think about [ generative flow network ]s as a potentially tractable formulation of probabilistic program induction?! executing a line…
Modified: March 14, 2022.
Note: see [ reinforcement learning notation ] for a guide to the notation I'm attempting to use through my RL notes. Three paradigmatic…
Modified: April 23, 2022.
People who do research have a very ground-level, zoomed-in view of their field. They know where the current obstacles are, how incredibly…
Modified: January 16, 2021.
Silver, Singh, Precup, and Sutton argue that Reward is enough: maximizing a reward signal implies, on its own, a very broad range of…
Modified: March 02, 2022.
stray thoughts about reward functions (probably related to the [ agent ] abstraction and the [ intentional stance ]) one can make a…
Modified: April 06, 2023.
References: https://generative.ink/posts/simulators/ It seems pretty clear that the intelligence emerging from [ language model ]s is not…
Modified: February 16, 2023.
Getting language models to align their output with human preferences would be highly useful for [ computational life coach ]ing. What's the…
Modified: July 18, 2021.
tl;dr: the ideas we need to build intelligent systems may be different from those we need to understand them. Both are important, but…
Modified: February 26, 2022.
The [ agent ] model of intelligence imposes a sharp distinction between the agent and its environment, where the agent 'chooses' actions…
Modified: June 27, 2021.
Sometimes mentioned as a potential approach to [ AI safety ]. Gwern: Why Tool AIs want to be Agent AIs (roughly: because treating…
Modified: April 07, 2022.
These days we think a lot about using data to train large [ language model ]s. But there's only so much data in the world; eventually we'll…
Modified: October 27, 2022.
Incorporating explicit memory and retrieval seems pretty clearly like the next frontier in language modeling and AI more broadly. We have…
Modified: September 03, 2022.
The core of the transformer architecture is multi-headed [ attention ]. The transformer block consists of a multi-headed attention layer…
Modified: February 13, 2023.
I don't know quite how to articulate or formalize this, but I get a sense that there is something fundamentally analogue, 'periodic' or…
Modified: March 19, 2024.
This may be a central point of confusion: how do we define AI systems that have preferences about the real world, so that their goals and…
Modified: April 12, 2023.
The models we use in AI are [ all models are wrong|wrong ] (if maybe still useful). How? Agency The [ agent ] model assumes a separation of…
Modified: February 13, 2022.
This page is a general jumping-off point for organizing my thoughts about the [ AI research landscape ], where the field is, where it is…
Modified: November 27, 2023.