From @visakanv on Twitter: (relevant to [ nothing matters ])
A lesson from NameRedacted: unresolved questions are the worst thing in meditation. For example, you're just sitting down to practice when…
Considering a bilevel optimization problem (or saddle point problem) on the two-argument function , in general it holds that That is, the…
Short descriptions of things, when they exist, must capture some kind of structure. The principle of [ Occam's razor ] posits that we should…
Mirror descent is a framework for optimization algorithms: many algorithms can be framed as mirror descent, and proofs about mirror descent…
What pieces of [ mirror descent ] can we automate? See also [ natural gradient implementations ] Given a mirror function , we can compute…
(originally from 2020-04-29) On another note, last night I tried to dictate (on Otter) my sense of my life goals. I came up with a very…
[ Otter notes ]: Can I explain what a mixed effects model is from a graphical model standpoint? On the inference side, I think it's just…
A mixture-of-experts model consists of a set of functions , the 'experts', and a gating function that determines how to select which…
I have a [ strong opinion weakly held ] that doesn't seem to be widely shared in the [ approximate Bayesian inference ] community: reverse…
Original paper: Finn, Abbeel, and Levine, ICML 2017, https://arxiv.org/abs/1703.03400 An approach for [ meta learning ] that works with any…
Often we don't explicitly use 'model-based RL' methods, instead people in robotics talk about Sim2Real: adapting a policy pretrained in a…
Stack: goal: sample from conformations of arbitrary hydrocarbons (or whatever). simpler goal: sample from conformations of ethane. simpler…
Naively you might think that the government just decides how many dollars there should be, and that's that. This is not true. Since [ IOUs…
A monoamine oxidase (MAO) is an enzyme that breaks down mono-[ amine ] neurotransmitters such as [ dopamine ], [ serotonin…
A very natural form of [ meta-reasoning ] that selects the most promising computations. The simplest form of 'expanding' a node assumes a…
There is a connection between moral realism and belief in [ qualia ]. If you see "experience" ([ awareness ]) as a real, fundamental aspect…
I don't hold the moral view that it's better to be a morning person than an evening person. Having always tended towards a later sleep…
In any human-to-human interaction, language carries some very important high-order bits, but it can only carry a few bits. It can help…
This is one of the big problems with the world. Not the only one, and not the only way to look at it. But it's everywhere. status : a…
(see David Graeber https://www.strike.coop/bullshit-jobs/ ) Most work is oriented towards achieving [ instrumental goal ]s. But most…
maneuvering: the bike goes where I look. look around the turn I want to do. keep elbows up. shift body weight to counterbalance the bike. E…
possible refs: google's multimodal architectures: https://webcache.googleusercontent.com/search?q=cache:https://towardsdatascience.com…
From a conversation I had about [ attention ] mechanisms in deep architectures. Maybe that terminology is too suggestive --- it's just a…
We say that a random vector is multivariate Gaussian with mean and covariance matrix if it can be written where is a vector of i.i.d…
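A minimal sketch of this construction, under assumptions of mine rather than anything stated in the note: the i.i.d. entries are standard normals, and the matrix applied to them is the Cholesky factor L of the covariance (so L Lᵀ equals the covariance). The helper names `cholesky` and `sample_mvn` are hypothetical.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L @ L.T == cov, for a symmetric positive-definite cov."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def sample_mvn(mean, cov, rng):
    """Draw one sample as mean + L z, where z has i.i.d. standard normal entries."""
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mean)]


mean = [1.0, -2.0]
cov = [[4.0, 2.0], [2.0, 3.0]]
rng = random.Random(0)
draws = [sample_mvn(mean, cov, rng) for _ in range(5)]
```

Any matrix whose product with its own transpose gives the covariance would do; Cholesky is just a convenient choice.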
[ thoughts on multivariate causalimpact ]
This was originally a section of breakup.org, written several years ago. this is more related to jobs and identity, but for cases when I get…
I want to intentionally spend my time well. I remember back in grad school I would spend evenings reading papers, just as a form of growth…
I've identified as a 'tech' person, but I now feel uncomfortable in many tech circles. What is tech and what does it mean to be a tech…
It's a useful exercise to occasionally reflect on what I value. stab 1: Generally pro tech, creating new things, non-zero-sum contributions…
Recommended by Michael Edward Johnson:
A 'natural' abstraction is one that we expect any agent (or at least, a wide range of agents) to develop because it gets at something…
We don't typically think of it this way, but you can derive a [ gradient descent ] step as finding the point that minimizes a linearized…
How can we automate [ natural gradient ]? See also [ mirror descent implementations ]
Cool trick: some applications can improve on nearest-neighbor lookup by training 'Exemplar SVM's. Instead of matching against a set of…
My position (a [ strong opinion weakly held ]) is that global utility is currently negative, and probably always has been. It's conceivable…
A negligible function is a function μ such that, for any positive integer c there exists an integer N_c such that for all x > N_c, |μ(x)| < 1/x^c, i.e., that…
Christian Naesseth, Fredrik Lindsten, Thomas Schon (2015): http://proceedings.mlr.press/v37/naesseth15.html The main idea: In an SMC…
Like the proverbial half-full glass, smart people can look at the same reality of the current capacities of neural nets, and come to…
Sometimes you'll see people say that neural nets 'just' memorize and interpolate their training data. No one denies that neural nets with…
Parts of a neuron: dendrites: these branch out to receive connections from other cells axons: these branch out to send signals to other…
The folklore no-free-lunch 'theorem' in machine learning says that, for any pair of learning algorithms, there exists some dataset on which…
No-self is one of the [ three characteristics ] that traditional Buddhism holds are present in all phenomena. In later Buddhism, the…
https://arxiv.org/abs/1712.02390 Basic idea: optimizers like Adam and RMSProp already keep track of posterior curvature estimates. These are…
Instead of directly targeting a specific rate of inflation, a [ central bank ] may target a fixed rate of nominal GDP growth, which is equal…
One way to model real-world [ causality ] is a bunch of forces working with and against each other. In this view, no individual force…
NFTs 101: https://medium.com/@intenex/nfts-101-why-nfts-are-a-generational-innovation-4626ae803e3b Among many other things, NFTs are…
Obligatory disclaimer: there will never be a drug to turn you into Einstein. Most of effective high-level thinking lies in 'software…
References: Gu et al., Continuous Deep Q-Learning with Model-based Acceleration (2016). Instead of modeling directly, we build a network…
Something can be true but not 'true enough'. That is, you have a compelling causal theory for why X should increase Y. It might be that the…
I've started reading The Art of Doing Science and Engineering by Richard Hamming. History of computing: Analog computing goes back forever…
Because: [ goals are arbitrary ]: achieving a goal, or failing to, doesn't really matter because the goal was arbitrary anyway. From the…
There's a spiritual idea, in Buddhism and elsewhere, that there is "nothing to do": everything is already suffused with "primordial…
Don't invert that matrix: https://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/ Seven sins of numerical linear algebra…
A very incomplete and maybe nonsensical intuition I want to explore. Classically, people talk about very simple [ reward ] functions like…
A few (relatively uninformed) thoughts about on- vs off-policy [ reinforcement learning ]. Advantages of on-policy learning: On-policy…
Original: Daily reflections What am I grateful for today?:: Some goals : Goals for the next ~year:: Goals for the next ~month:: Goals for…
The brain doesn't have separate models of each of the [ sense gate ]s (and thought). Instead it just stores each moment of perception as a…
Informally, a function is a one-way function if it is easy to compute but hard to invert. Or more generally, hard to pseudo-invert, i.e…
These are things that I might plausibly decide I want to work on when I sit down on the weekend. Expanding nodes on this graph. Blogging…
How do we maintain values when our models of the world shift? If someone's goal in life is to "do God's will", and then they come to believe…
As Josh Marshall said, at the beginning of the Trump presidency: "Optimism is not primarily a prediction but an ethic, a philosophy, a way…
If is a [ martingale ] and is a [ stopping time ], then any of the following conditions implies that : The stopping time is bounded…
Ken McLeod claims that 'emotional reactivity' is the origin of suffering. Pain consists both in what happens and in our reaction to it. But…
mnemonic: OIL RIG = 'oxidation is losing (electrons), reduction is gaining (electrons)' in contrast to [ acid-base chemistry ], which is…
This is how [ mitochondria ] produce most of their [ ATP ]. Mitochondria have an outer membrane and an inner membrane, so there are two…
Look again at that dot. That's here. That's home. That's us. On it everyone you love, everyone you know, everyone you ever heard of, every…
Basic notes from https://www.stats.ox.ac.uk/~doucet/andrieu_doucet_holenstein_PMCMC.pdf Setup: we have parameters and time series model…
Chocolate tasting: buy a bunch of high-end, single-origin chocolate bars. Parcel them out blind. Give people a pad to take notes on what…
We often see optimization problems with objectives of the form where is the main function of interest (e.g., training loss in machine…
“Remember that a person’s name is to that person the sweetest and most important sound in any language.” Dale Carnegie (How to Win Friends…
When you're thinking about doing something that feels right to you, it's easy to get caught up in worrying about what other people will…
reading the perceiver papers from Deepmind: Perceiver: Jaegle et al 2021 https://arxiv.org/abs/2103.03206 Perceiver-IO: Jaegle et al 202…
In the [ 5-MeO-DMT ] trip where I experienced [ ego death ], I saw a [ magical display ] of beautiful colors and flowing motion and…
The AI Effect refers to the widely-recognized phenomenon that 'once we know how to do it, it's not AI'. For example, playing chess well…
I always found it weird that philosophy spends so much time talking about specific historical philosophers. Who cares what Aristotle, or…
When considering one's impact on the world, it's important (? or at least tempting) to think about your value-over-replacement. If you…
(see also: [ large models ]) There's a viewpoint that neural nets just memorize the training data, so the more training data you have, the…
Developed and widely used in Russia, phenibut is an analogue of [ GABA ] with a phenyl ring substituted at the β carbon, giving it the name…
Why Nature Chose Phosphates (science.org)
The paradoxical thing about pointing-out style meditation teaching is that you can't really explain the instructions when they're unclear…
(see also my [ deep RL notes ] from John Schulman's class several years ago, which cover much of the same material) We can approach…
There are a few ways to do this. Google's PaLM uses rotary embeddings so it seems like that's probably close to the state of the art? But…
Different experimental conditions may give rise to different outcomes . For example, let the variable indicate whether a person is…
Prayer is a form of [ therapy ]. It's about clarifying your values: figuring out what you really want so that you can ask God for it. and…
A [ stochastic process ] is predictable if its value at time t is fully determined by information available at time t − 1. Any fully…
A really valuable exercise that I should consider building into my routine is to regularly try to make and write down explicit predictions…
Consider an agent that is purely concerned with [ predictive processing ]: finding the optimal [ compression ], or equivalently the optimal…
The theory of predictive processing seems to be attracting a lot of interest in neuroscience and [ meditation ] circles. I want to try to…
https://www.quora.com/What-is-a-preference-cascade A lot of how people act is driven by how they think they're 'supposed' to act. There's…
AI / RL Distributional RL book: https://www.distributional-rl.org/ Alignment Sequences: Value learning: https://www.alignmentforum.org/s…
A Bayesian view of (one aspect of) [ attention ] inspired by a conversation with Shamil Chandaria on [ predictive processing ]. (but this…
It seems like there is, or can be, a virtuous relationship between privacy and generalization. You don't want to memorize too many…
I have some discomfort with the political concept of 'privilege', e.g.: Being white is a privilege. Being male is a privilege. Being…
(I got this concept from SuccessfulFriend.) As people grow up and form their identities, they need models, and not just models; they need…
Can we think about [ generative flow network ]s as a potentially tractable formulation of probabilistic program induction?! executing a line…
Many [ probabilistic programming ] researchers frame their work as part of the broader problem of [ artificial intelligence ]. Artificial…
A short note on interpreting a transformer layer as performing maximum-likelihood inference in a Gaussian mixture model: https://arxiv.org…
Matt Levine explains how a financier might react to losing a billion dollars: Sure sure the risks didn’t work out but you probably have a…
A probability space consists of: A set of outcomes aka possible worlds; these represent all the ways the world might be. This is the…
(aka, why frequentists will always make more money) In the "real" (corporate/governmental) world, most high-level decision making is…
Introduced by Geoff Hinton (1999): Products of Experts. Each expert produces a probability distribution. These are combined by…
The idea of 'projection' in psychology means to assume that someone else has the same flaws, or foibles, or motivations as you do. It struck…
The mechanism: if you hold tokens, you can choose to stake them, and in order to run a network node you must stake some number of tokens…
The policy gradient theorem says that For simplicity we'll assume a fixed initial state and fixed-length finite trajectories, but the…
References: Tegmark and Omohundro, Provably safe systems: the only path to controllable AGI (2023). https://arxiv.org/abs/2309.01933 they…
Proximal methods in optimization The proximal operator of a [ convex ] function is defined as the minimizer of plus a distance penalty…
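A small numerical sketch of the definition, under my own assumptions: the distance penalty is taken to be half the squared Euclidean distance with unit step size (a common convention, not necessarily the one used in this note), and the function is f(u) = λ|u|. Under that convention the prox has the well-known closed form of soft-thresholding. The helper names are hypothetical.

```python
def soft_threshold(x, lam):
    """Closed-form prox of f(u) = lam * |u| at x (assuming a 0.5 * (u - x)**2 penalty)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def prox_by_grid(f, x, lo=-5.0, hi=5.0, steps=100001):
    """Brute-force argmin over u of f(u) + 0.5 * (u - x)**2 on an evenly spaced grid."""
    best_u, best_val = lo, float("inf")
    for i in range(steps):
        u = lo + (hi - lo) * i / (steps - 1)
        val = f(u) + 0.5 * (u - x) ** 2
        if val < best_val:
            best_u, best_val = u, val
    return best_u


lam = 1.0
for x in (-3.0, 0.4, 2.5):
    closed_form = soft_threshold(x, lam)
    brute_force = prox_by_grid(lambda u: lam * abs(u), x)
    # The two should agree up to the grid resolution (1e-4 here).
    print(x, closed_form, brute_force)
```

The grid search is only a sanity check; in practice one would use the closed form or a proper solver.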
references: paper: https://arxiv.org/abs/1707.06347 great blog post on implementation details: https://iclr-blog-track.github.io/2022/0…
[ 5-MeO-DMT ] [ mescaline ] [ psilocybin ]
Toilet: A bidet. Cold water, warm-water (if a hose from your toilet can reach your sink's plumbing), or internally heated. It saves toilet…
It's tempting to use [ natural gradient ] ascent to optimize a variational distribution. We could also consider using it to optimize the…
A portfolio containing a long (European) call and short (European) put [ option ] with the same strike price and expiry date is equivalent…
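A quick check of the payoff side of this, as a sketch: at expiry, a long call minus a short put at the same strike pays spot minus strike whatever the spot price turns out to be, which is the same payoff as having agreed to buy at the strike. The function names and the example prices are illustrative only, and this ignores premiums and discounting.

```python
def call_payoff(spot, strike):
    """Payoff at expiry of a long European call."""
    return max(spot - strike, 0.0)


def put_payoff(spot, strike):
    """Payoff at expiry of a long European put."""
    return max(strike - spot, 0.0)


def long_call_short_put(spot, strike):
    """Payoff at expiry of the combined position: long one call, short one put."""
    return call_payoff(spot, strike) - put_payoff(spot, strike)


strike = 100.0
payoffs = {spot: long_call_short_put(spot, strike) for spot in (60.0, 100.0, 145.0)}
# Each entry equals spot - strike: losses below the strike, gains above it.
```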
A five-membered ring of four carbons and one nitrogen: C4H4NH.
General procedure for setting up a new Python project. Create a new git repo and clone into a directory my_new_project Add files…
"Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin." - John von Neumann Young man, in…
Formally, a random variable is a (measurable) function defined on outcomes from a [ probability space ] . That is, in any possible…
a powerful tool for establishing [ causality ]
The rate equation or master equation for a continuous-time Markov [ stochastic process ] describes how the probability density of the…
From a [ utilitarian ] perspective, all of morality follows from improving global utility, and it follows that it'd be better to do this…
In no particular order. Items may move to [ previously read ] if I read them or former reading inbox if I decide I'm not currently…
One model you could have of reading a book is that the book contains information, and once you've read it, you now possess that information…
Why do I want to write more? Because: writing forces thoughts to crystallize. It forces me to draw conclusions about what I believe and who…
[ Nielsen's notes on ASI xrisk ] introduced the thought experiment: If you ask an all-knowing oracle a question like "Can you give me a…
See also [ family recipes ]. Roast chicken and vegetables: preheat oven to ~400°F. cover a spatchcocked chicken with salted garlic butter at…
The best way to recruit people is to convince them that they will learn and grow by working with your team. Pitches that have 'worked' for…
Note: see [ reinforcement learning notation ] for a guide to the notation I'm attempting to use through my RL notes. Three paradigmatic…
https://andyljones.com/posts/rl-debugging.html https://www.reddit.com/r/reinforcementlearning/comments/9sh77q/what_are_your_best_tips_for…
see: [ steering language models ], [ direct preference optimization ] We are given a bunch of pairwise preference evaluations, of the form…
There tends to be a lot going on in RL algorithms, with a whole mess of different quantities defined across timesteps. It's useful to try to…
[ relationship advice ]
see also (maybe combine with?) [ relationship ] Accept [ bids ] as much as possible. Praise your partner in public (and in private). Stay in…
Suppose we want a [ transformer ] to evaluate the inequality returning if and otherwise. For integer , this can be done with a…
The selection operation y = where(c, a, b) returns How can a [ transformer ] layer implement this operation? One approach is to use…
When I was younger---in college or in grad school---I was sometimes conflicted about whether I should prioritize trying to get to correct…
If a model with data has normalizing constant , then the replica trick says that This allows us to analyze the average log-normalizer…
In modern ML, representation learning is the art of trying to find useful abstractions, embodied as encoding networks. We can learn…
To be a successful researcher it's incredibly important to find and join your [ research community ]. Go to conferences (especially to small…
This note lists some ideas and directions for research I'm interested in or excited about. Some are more fleshed out than others, some more…
(see also: [ impact ]) I've been feeling depressed partly because the actual PhD research I did was (in my view) pointless, and more broadly…
People who do research have a very ground-level, zoomed-in view of their field. They know where the current obstacles are, how incredibly…
Reservoir samplers solve the following task: sample k items without replacement from a stream of unknown length n. Because the length is…
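One standard way to do this is the classic "Algorithm R", sketched below. This is my choice of illustration and not necessarily the variant this note goes on to describe. The function name `reservoir_sample` and the parameter `k` (the number of items to keep) are my own labels.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return k items drawn uniformly without replacement from an iterable of
    unknown length (or every item, if the stream holds fewer than k)."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep item i (0-indexed) with probability k / (i + 1),
            # overwriting a uniformly chosen slot.
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(iter(range(1000)), k=5, rng=random.Random(0))
```

The stream is consumed once and only k items are ever held in memory, which is the point when the length isn't known up front.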
Teachers or centers I'd be interested to do a retreat with/at: Tucker Peck Michael Taft Tina Rasmussen (Cloud Mountain 13-day retreats…
References: The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A" https://arxiv.org/abs/2309.12288 Studying Large Language…
References: Ludwig Winkler's post on Reverse time stochastic differential equations. Suppose we have a [ stochastic differential equation…
stray thoughts about reward functions (probably related to the [ agent ] abstraction and the [ intentional stance ]) one can make a…
When thinking about the [ reward ] function for a real-world AI system, there is always some causal process that determines reward. For…
Silver, Singh, Precup, and Sutton argue that Reward is enough: maximizing a reward signal implies, on its own, a very broad range of…
Suppose we have a [ Markov decision process ] in which we get reward only at the very end of a long trajectory. Until that point, we have no…
See also: [ cooperative inverse reinforcement learning ], [ love is value alignment ]
Things that might be useful to log in a [ reinforcement learning ] algorithm: Return of each trajectory. (summarize as mean/std/min/max…
Implement MuZero or something similar. What are the 'state of the art' RL algorithms? What is known and not known about [ value alignment ]?
Suppose we want to maximize reward, but we only get a couple bits of reward data every few hundreds/thousands of actions, whereas we get…
Deriving here just for my own edification. At each timestep a rocket ejects mass at velocity relative to its current reference frame. At…
About 5% of people are gay, so in any given community it's about twenty times harder for a gay person to find a partner than for a straight…