All Notes: Nonlinear Function

All Notes

free lunch theorem

'[ no free lunch theorem ]' arguments are misleading because they consider the space of all possible functions. In fact, we usually care…

free will

NameRedacted points out the strong free will theorem. This says that electrons sometimes have 'choices': situations where their behavior…

friend

front-door adjustment

frustrations with meditation advice

fully automated luxury gay space communism

[ my goals ]

fun is good

Should I give up having fun in order to do impactful things? It'd be tempting to think that it'd be more virtuous to be serious and mission…

fundamental

fundamentals are useless for adults

There's often a lot of space between learning [ fundamental ]s and being able to do a thing. Understanding Turing machines didn't…

future self

gate

Examples recommended by GPT4: Long Short-Term Memory ([ LSTM ]): Paper: "Long Short-Term Memory" by Sepp Hochreiter and Jürgen Schmidhuber…

gay communities

gay pride

Why would you be 'proud' of something you had no control over? The core revelation for me was that pride is the opposite of shame. Most…

general intelligence

Is there such a thing as 'general intelligence'? What capabilities does it require? Is it a goal worth striving for? We usually speak about…

general-purpose intelligence

general rules

Rules that work in many situations are valuable. If you can cook a burger, you're a McDonald's employee. If you can specify the rules for…

general techniques are simple

If you need to open a specific lock, you can use a key that encodes the precise information needed to open that lock. If you need to open…

generalization

Fundamentally, where does generalization come from? [ causality ]: a model may generalize because it has discovered the true mechanism, or…

generalized policy iteration

Sutton and Barto use this as a general term for any form of interleaving policy evaluation steps with policy improvement steps. This…

generative flow network

Many objects can be generated by a sequence of actions. For example: Generating language by adding one word at a time Generating a molecule…

generative questions

Small talk can be mindnumbing and pointless. I like the idea of 'big talk, not small talk'. But realistically small talk serves a social…

generative vs discriminative modeling

"What I cannot create, I do not understand". Related to: [ computational complexity ]: provers vs verifiers. [ P != NP ] [ production vs…

giving in won't help

A Buddhist point. I was trying to fast last night and kept being tempted to relax it slightly. "It won't hurt anything if I just have one…

glimpses of AI

An intelligent [ agent ] should work to understand the world. This understanding takes the form of a set of relevant [ abstraction ]s, a…

global utility

global workspace

glutamate

Chemically, glutamate is the [ acid-base chemistry|conjugate base ] of glutamic acid (an [ amino acid ]). It is an [ anion ]; its sodium…

goals are arbitrary

If you fail to achieve your goals, you'll be sad---almost self-evidently. Nonetheless: there is no coherent notion of the 'right' goals to…

goals for 2021

Visit NameRedacted Visit SuccessfulFriend Visit Asian cities Do real bulking and cutting to get in shape Move Learn Mandarin Learn…

goals for raising kids

see also [ thoughts about kids ] Get them reading young---ideally by age 2 (like NameRedacted). Learn an instrument from a young age. Let…

grace

One can never fully deserve grace, but may receive (and extend) it regardless. Probably the most beautiful conceptual contribution of…

grad school advice

Thoughts drawn from my experience doing a CS PhD at a top-4 school around 2010-2016. They may be somewhat applicable to PhDs in other areas…

grad student depression

A recent survey found that ??% of Berkeley grad students suffer from depression. This should be shocking and dismaying. Yet no one seems…

gradient clipping

Why do we clip gradients in deep learning? When is it important and what is the right way to do it? It seems like the standard recipe used…

gradient descent

gradient of the log normalizer

For a normalized distribution , constructed from an (unnormalized) energy with normalizing constant as a function of parameters , in…

grading

How should I think about grades when [ teaching ] a class? I believe in mastery learning. Feedback isn't useful unless students have the…

gradually, then suddenly

"How did you go bankrupt? Two ways. Gradually, then suddenly." - Hemingway. It can take a long time to lay the foundations for significant…

graph neural networks

A 'graph neural net' is a differentiable, parameterized function whose input or output (or both) is a graph. Discriminative: graph as input…

great movies

see also: [ great shows ] Cloud Atlas Pulp Fiction

great shows

see also: [ great movies ] Deadwood Sense8

greeks

The Greek letters used most commonly in finance are probably alpha and beta from the [ single-index model ]. However, the term 'greeks…

grief is depression

When someone we love dies, a sense of possibility leaves the world. Our grief is usually proportional to how much we cared about them…

grokking

grounded

A nice observation from Percy Liang on the relationship between language modeling and grounded understanding: Just because you don't…

growing up means becoming wrong

(related: [ communication is processing ]) A big part of growing up is communicating to your future self. Your future self isn't going to…

growing up my own way

A thought that just occurred watching Billions. The two deputy attorneys (white guy and black woman ??) are in her apartment, which is nice…

growth mindset

Opposite of a fixed or 'scarcity' mindset. It's important to recognize that the world is nonzero-sum and that improvements are possible. We…

habits of intellectual conversation

these include things like: posing interesting questions for discussion useful [ generative questions ] like: what have you been reading…

habits of thought

hair transplant

At ~33 my hairline has been receding and is starting to thin in front. I have a few options: Accept hair loss and go bald. Accept hair loss…

hangover

happiness

A New Yorker article on [ happiness ]: http://www.newyorker.com/tech/elements/a-better-kind-of-happiness discusses happiness as a source…

hard attention

Closely related to [ discrete latent variable ]s and to [ reinforcement learning ] with discrete actions. If I do a thing and it goes well…

hating god

On Michael Taft's podcast, A. H. Almaas pointed out that an obstacle for most people realizing a sense of divine, nondual love, is some…

hedonic treadmill

heuristics for research taste

Pretend that some other group has published the paper you're imagining: are you excited to read it? Write down ten ideas and ask a mentor to…

hierarchical plan

high dimension

high-dimension

high-dimensional

high-level actions

The plans I make now include components that would have been impossible for me to conceive of as a kid. At the moment (July 2020), I'm…

high leverage projects

What are some things that people are doing now that are just clearly valuable? 3blue1brown: So much of math is hidden behind notation…

hold the view

[ Dan Brown ] likes to say that 'the view is the meditation'. That is, meditation isn't

how to keep up with papers

People on Reddit worry that there are hundreds of new ML papers every day---how could you possibly keep up? How can you filter the firehose…

how to make a friend

[ be open to friendship ] [ act as if you're already friends ] [ people like hearing their name ]

huddling together against the dark

campfires apres ski choir rehearsing under a crisp fall night sky ragnars in vans shabbat various gay gatherings (e.g. pride) -- here the…

ideal tiny kitchen

Breville/Polyscience Control Freak: induction burner with temperature sensor. replaces normal stove, electric kettle, maybe sous vide, maybe…

identifiable

identity

identity goals

from 2017: remember things I am at least somewhat an expert in, that other people may not have seen: philosophy of bayesian statistics…

identity is never fixed

I'm sometimes tempted to look back and find patterns in my life, and identify those as "who I really am". For example: maybe I want to be a…

if ever a prof

(originally written as a Google doc between 2010-2012) most of this advice is obvious, but still good to remember. students appreciate food…

ill-conditioned

Multiple senses: An 'ill-conditioned matrix' has a large ratio between its largest and smallest eigenvalue (more generally, see what is a…
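
A minimal sketch of the eigenvalue-ratio sense above, restricted to symmetric 2x2 matrices so the eigenvalues have a closed form. The function names and the example matrix are illustrative assumptions, not taken from the note.

```python
import math

def symmetric_2x2_eigenvalues(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], largest first."""
    center = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    return center + radius, center - radius

def eigenvalue_ratio(a, b, c):
    """Ratio of largest to smallest eigenvalue magnitude (large means ill-conditioned)."""
    lam_max, lam_min = symmetric_2x2_eigenvalues(a, b, c)
    magnitudes = sorted([abs(lam_max), abs(lam_min)])
    return magnitudes[1] / magnitudes[0]

# Hypothetical example: nearly collinear columns give a large ratio.
nearly_singular = eigenvalue_ratio(1.0, 0.99, 1.0)
identity = eigenvalue_ratio(1.0, 0.0, 1.0)
print(nearly_singular, identity)
```

The identity matrix has ratio 1 (perfectly conditioned), while the nearly singular example has a ratio in the hundreds.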

imagination rollout

References: Gu et al., Continuous Deep Q-Learning with Model-based Acceleration (2016). A technique used in [ model-based rl ], where we…

imitation learning

immortality is bad

The Eliezer Yudkowsky school of thought is that immortality is possible, and obviously desirable; any other position is [ learned…

impact

One aspect of depression recently has been feeling like things are pointless, there's nothing valuable for me to do. My entire PhD has been…

impermanence

[ Shinzen Young ] reframes the traditional Buddhist concept of impermanence as "flow". I think this is starting to make sense to me. I…

implicit regularization

Examples: SGD prefers some minima over others

importance sampling

Importance sampling allows us to compute expectations under a distribution using samples from a different distribution, by weighting the…
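
A small sketch of the idea, under assumptions that are not in the note: the target is a standard normal, the proposal is a wider normal with standard deviation 2, and the quantity estimated is the second moment (whose true value under the target is 1). Each sample from the proposal is weighted by the density ratio target/proposal.

```python
import math
import random

def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def importance_estimate(f, num_samples, seed=0):
    """Estimate E_p[f(x)] for p = N(0, 1) using samples from q = N(0, 2^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(0.0, 2.0)  # draw from the proposal q
        weight = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)  # p(x) / q(x)
        total += weight * f(x)
    return total / num_samples

# Second moment of the standard normal; the exact answer is 1.
estimate = importance_estimate(lambda x: x * x, num_samples=200_000)
print(estimate)
```

The proposal here is deliberately wider than the target, which keeps the weights bounded; a proposal narrower than the target can give weights with very high variance.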

important neural net phenomena

[ grokking ] / [ phase change hypothesis ] emergence of near-discrete features in large transformers symmetries / non-[ identifiable…

imposter syndrome

For me, imposter syndrome felt like knowing that there was something deeply wrong with me, something I was missing or didn't get, that would…

in silico

In the 21st century, humanity is developing a new form of engineering. Rather than manually designing artifacts, we are optimizing over…

incentives

indole

Indole alkaloid - Wikipedia Indole - Wikipedia A benzene ring fused with a [ pyrrole ring ]

inductive bias

Ways to specify inductive bias: Feature engineering Prior distribution acts as regularizer in MAP estimates Graphical model (constraint on…

inductive types

Talia Ringer says that these are one of the most beautiful, foundational ideas in programming languages: https://twitter.com/TaliaRinger…

infinite doorways

Around 15min into https://www.youtube.com/watch?v=gWAZFuz_mFc , [ Ram Dass ] claims there are 'infinite doorways'. You start on the…

infinitesimal

The Leibniz calculus notation using infinitesimal quantities like or is simultaneously Very sensible and intuitive, but also Constantly…

influence function

References: Bae et al. (2022) If Influence Functions are the Answer, Then What is the Question? https://arxiv.org/abs/2209.05364 Grosse et…

inner ring

instrumental goal

To achieve final goals, we have to break them down into a hierarchy of instrumental goals, and then get to work on achieving those. And for…

instrumental variables

The [ front-door adjustment ] allows identifying causal effects using a mediating variable that sits on the causal chain between X and Y…

intellectual friendship

I want to care about an intellectual topic and have friends and colleagues with whom I can enjoy discussing that topic. I used to have…

intelligence

[ theory of intelligence ]

intelligence forklift

Boaz Barak writes in GPT as an "Intelligence Forklift." that [ language model ]s seem to function effectively as [ tool AI ] that can…

intelligence is not consciousness

A lot of discussion around [ artificial intelligence ] implicitly conflates intelligence with [ consciousness ]. It assumes that as we…

intelligence is not moral worth

Inspired by this tweet: https://twitter.com/davmre/status/841803926051549184 Ways we currently equate intelligence with moral worth: animal…

intentional stance

interactive proof

interest rate

It's a bit counterintuitive that high interest rates prevent inflation. After all, doesn't a high interest rate mean that the [ central bank…

interesting things dennett says

interesting things Dennett says in "Darwin's Dangerous Idea": the "Baldwin effect" uses reinforcement learning to construct a version of…

interface

Interfaces enable modularity. In general, standardizing an interface can yield quadratic benefit at linear cost. Suppose we have people…

interior design

Things to pay attention to in setting up a new place: Lighting makes a huge difference and can totally change the ambience of a space. Don…

intermittent fasting

Fasting is a powerful, life-changing idea because it's simple, clear, easy to follow. A diet plan that involves counting calories requires…

intermolecular force

References: https://en.wikipedia.org/wiki/Intermolecular_force and linked pages conversations with GPT-4 A molecule is a set of atoms…

intervals

The theory of musical intervals almost makes mathematical sense (but not quite): The sensible part: integer ratios The ancient Greeks…

intrinsic motivation

objectives: maximize entropy of the state visitation distribution requires empirical estimates of [ entropy ] maximize mutual information…

inverse reinforcement learning

ionic bond

ionotropic receptors

is vs ought dichotomy

David Hume pointed out that there's no logical way to get from 'is' (descriptive) statements to 'ought' (normative) statements. This is…

isoperimetric

The isoperimetric problem: among all closed curves in the plane with equal perimeter, which encloses the largest area? It's well-known…

it's hard not to learn from experience

It can be very hard to hold onto a positive self-image and an [ optimism|optimistic ] worldview, even if you intellectually 'know' these to…

it's hard to frame questions about what you don't know

Admitting things that I personally don't know is hard, because it feels like admitting a weakness or failing. But ignorance isn't a…

it doesn't matter what you feel

Apparently NameRedacted told this story in a talk. Sharon Salzberg says that if she could put one thing on her tombstone it would be this…

it seemed profound at the time

Notes copied from Google Docs https://docs.google.com/document/d/1G7Gxo-A3gQrlUx3G4BYmH-GjUWbB7eTOJZcuAs2ii0A/edit most of these things…

jhana

job goals

From 2017: stay publicly active. work on open side projects, and/or publish, and/or blog. live at such a means that I could lose my job and…

joy is not selfish

Something I've struggled with: given my position of immense privilege, how can I justify doing wasteful 'fun' things like skiing or clothes…

karma

I don't know if this makes sense, but one intuition I have for karma comes from the observation that the weights of a least-squares linear…

kelly criterion

We are given the opportunity to bet some fraction of our wealth on a coin flip with probability . We can repeat this as many times as we…
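
A hedged sketch of the repeated-bet setup, assuming an even-money payoff (win or lose exactly the amount staked) and treating the coin's probability as the chance of winning; the note's teaser does not state the payoff, so both are assumptions. Under them, the fraction that maximizes expected log wealth has the closed form 2p - 1, which the grid search below checks numerically.

```python
import math

def expected_log_growth(fraction, win_prob):
    """Expected log-wealth growth per even-money bet of `fraction` of wealth."""
    return (win_prob * math.log(1.0 + fraction)
            + (1.0 - win_prob) * math.log(1.0 - fraction))

def kelly_fraction_even_money(win_prob):
    """Closed-form maximizer for an even-money bet: 2p - 1, floored at 0."""
    return max(0.0, 2.0 * win_prob - 1.0)

def best_fraction_by_grid(win_prob, steps=1000):
    """Search fractions in [0, 1) as a numerical check on the closed form."""
    candidates = [i / steps for i in range(steps)]
    return max(candidates, key=lambda f: expected_log_growth(f, win_prob))

p = 0.6  # hypothetical win probability
print(kelly_fraction_even_money(p), best_fraction_by_grid(p))
```

Betting more than this fraction lowers the long-run growth rate even though each individual bet still has positive expected value.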

kernel

multiple senses: in machine learning: positive definite (Mercer) kernels in linear algebra: kernel (nullspace) of a linear map in CS systems…

ketamine

Effects in the brain Ketamine is an antagonist of [ NMDA receptor ]s, and activates [ AMPA receptor ]s. References: Ketamine – Lorien…

keys and locks

pg says: the random things that you learn as a kid make you into a key. Your job is then to find the lock that you fit into. But that's not…

kids matter

Adults tend to talk to and work with other adults. We don't spend nearly enough time addressing the problems of younger people: college…

kitchen sink deep learning

I just coined the phrase 'kitchen sink' deep learning for a vague idea that comes to me occasionally. Roughly: rather than using a uniform…

language basics

An annoying thing about language-learning tutorials is that they often focus on language that you'll never actually use as a tourist. When I…

language model

language model cascade

large control policies

Taco Cohen speculates on Large Control Policies as a successor to large language models: https://twitter.com/TacoCohen/status…

large effects

Much of statistical practice is concerned with distinguishing signal from noise. For example, significance tests quantify the likelihood…

large models

If you believe that neural nets basically just memorize the training data, then training larger and larger models is hopeless. The…

leadership

I don't want to be led. I want to be creative and do things that are dramatically new. Telling other people what to do feels almost evil to…

learned helplessness

Happens all the time, at small scales and large scales. Large scale: When I was young, it was possible to imagine becoming a confident…

learning from success is better than failure

The set of good approaches is often hidden in an exponentially large space. Learning that an approach is not good doesn't help narrow that…

learning new skills

leaving academia

There's no easy or fast solution for feeling good about leaving academia, because you're giving up some aspect of your [ identity ], and…

legibility

legibility in software

(related: [ growing up means becoming wrong ], ask for evidence, [ Seeing Like A State ]) Related to some interactions with NameRedacted…

legible identity

A thing that's tough about going through big personal changes is that it takes a while for your self-model to catch up with your actual self…

let go

let me google that for you

People on the internet have very different standards as to when and how it's okay to ask a question. There are roughly two camps: Search…

liberation

library of Babel

https://maskofreason.files.wordpress.com/2011/02/the-library-of-babel-by-jorge-luis-borges.pdf Insights illustrated by this story: Naming…

life conclusion

lighthouse goal

limitations of autodiff

In principle we can apply [ automatic differentiation ] through any composition of differentiable operations. This lets us get gradients of…

limiting belief

limits on individual impact

I want to change the world. What does that mean? Suppose I create a billion-dollar company. That's an enormous amount of value. It's many…

linear attention

tags: created: 2023-12-07 modified: 2023-12-07 References: https://arxiv.org/abs/2006.16236 The usual [ transformer ] [ attention…

linear time-invariant

A linear time-invariant system is one where the dependence of the output on the input is: linear: an input produces an output , and…

lipid bilayer

listening inbox

podcasts and audiobooks Alan Watts 'out of your mind' lectures

live the life you want to have

I've heard it said, and it's been ringing true to me, that the thing to do is live the life you want to have now, not plan to spend years…

living on autopilot

Some things are genuinely hard to do. But many others I don't do just out of laziness, or maybe lack of [ agency ]. I know that they're…

lone pair

An atomic orbital with two electrons both attached to the same atom. In contrast to a bond, where each atom contributes one electron. Lone…

long-term context in Transformers

Notes on https://www.pragmatic.ml/a-survey-of-methods-for-incorporating-long-term-context/ 'Standard' transformers have O(n**2) complexity…

looking under the lamppost

There's a tendency to focus on things that we have the (conceptual/mathematical/societal) tools to understand, even when we know this is…

loose veganism

How do I justify being only 'mostly' [ vegetarian ]? I know that cows and chickens are abused to produce milk and eggs. Why is avoiding…

love is value alignment

What does it mean to love someone? Of course this question has as many answers as there are people, and probably more. But here's one view…

love-positive

loving-kindness

lustful curiosity

I saw this phrase on Twitter somewhere and it really resonates as a description of the ideal approach to science. There is no real…

macrostate

A macrostate in statistical mechanics is a collection of base-level states; equivalently, a subset of [ phase space ]. It's what you see…

magical display

mahamudra

Mahamudra means the 'great seal' or 'great gesture'. We take and [ hold the view ] that each event arising in [ awareness ] --- every sight…

manager

Managing for high-variance / creative work versus low-variance consistent work: https://blog.sbensu.com/posts/2023-01-18-high-variance…

managers are worst-case analyzers

There are a lot of difficult decisions to be made in life. Maybe you need to decide the business strategy of a company, knowing that good…

many models

An idea I got from [ John Higgs ]'s discussion of metamodernism is that taking [ all models are wrong ] to its logical conclusion requires…

many selves

Sometimes I've been scared of losing my identity. In particular I worry about working a non-research job, or having sex with (or being…

marijuana

I have a private theory about what marijuana does. I'll try to articulate it here. I don't know much about the public theories, so maybe…

martingale

A martingale is any [ stochastic process ] that stays the same in expectation. Formally, $X_t$ is a martingale if $\mathbb{E}[X_{t+1} \mid X_1, \dots, X_t] = X_t$. This condition is related to…

massage

How to think about giving a good massage? Know which way the muscle fibers go. For deep release, exert force perpendicular to the muscle…

math

matrix exponential

Reviewing this 3blue1brown video: https://www.youtube.com/watch?v=O85OWBJ2ayo The matrix exponential is written as E to the power of a…

matrix inversion lemma

The Woodbury-Morrison-Sherman matrix inversion lemma is sometimes useful just for algebraic simplifications. In cases where and are…

matrix notation

Notation for Matrix Multiplication Let and . Then just by the definition of matrix multiplication (the summation over is performing…

maximal update parameterization

References: Hu, Yang (2022) Feature Learning in Infinite-Width Neural Networks https://arxiv.org/abs/2011.14522 Yang, Hu et al. (202…

maximum-entropy reinforcement learning

For any reward function and policy , consider the entropy-regularized reward Taking as our objective the (expected, discounted…

mcmc notes

Note: these are personal notes, taken as I was refreshing myself on this material. They're mostly stream of consciousness and probably not…

measurable function

A function is measurable with respect to [ sigma-algebra ]s on its domain and on its range if the pre-image of any event is…

mechanistic interpretability

meditation

The core insight that got me interested: "moments of recognizing your thoughts drifting and bringing them back to your breath" are not…

meditation following log

Dec 13, 2023 Sangha session with Dustin: Something I realized during a concentration practice is that being honest about how ‘well’ the…

meditation ideas I resist

Generally I think the dharma is deeply true and that [ meditation ] done right is healthy and potentially very beneficial. But I struggle…

meditative attainments

Specific states or abilities that can arise from skillful meditation: feeling [ equanimity ] first cessation seeing nimitta accessing…

melatonin

memory

memory efficient backprop

Suppose we want to do [ automatic differentiation ] on a [ computational graph ] of sequential length . This could equally well be a…

mental models

One last thought: mental models are so, so important. When I think about computer modeling, it's actually great: computers are powerful, they…

meritocracy

Like democracy, meritocracy is the worst form of social organization, except for all the others that have been tried. Of course it is good…

mesa optimizer

References: Risks from Learned Optimization in Advanced Machine Learning Systems A [ reinforcement learning ] algorithm attempts to find the…

mescaline

Chemically, a substituted [ phenethylamine ]. Like [ dopamine ] but with methyl groups hanging from the two oxygens, and another oxygen…

meta learning

Generally this means training some aspect of the learning procedure itself. There is then an inner-loop learning procedure, which follows…

meta-level shape of machine learning

Unlike most modern [ deep learning ] systems, humans: don't have separate training/test phases (though we may have wake/[ sleep ]) don't…

meta-reasoning

Stuart Russell told the story of giving a talk on meta-reasoning at Stanford, with Don Knuth in the audience, where he opened with a slide…

methamphetamine

N-methyl-[ amphetamine ]