wisdom I've acquired: Nonlinear Function
Created: January 01, 2017
Modified: February 15, 2022

wisdom I've acquired

This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

From 2017: wisdom I've acquired:

  • the psychology of depression. :-( and grad school. :-( and being gay.
  • dual-process cognition theory: the importance of "aligning" your identity between your immediate fast reflexes, instincts, and verbal patterns, and your higher-level goals, values, and plans. If you can't trust your instincts, you have no confidence: you need to think through everything you say, you question every move, and you can't commit yourself to difficult or long-term endeavors (research projects, jobs, conversations, relationships) because you don't believe in your future self's ability to make the right decisions in the moment.
  • this is related to LGBT identity in the sense of gay pride and being "out". It is a huge burden to constantly filter all your language, actions, and relationships to "perform" a different self than the one you really are. It's much more freeing to know that you can be your "authentic self", to be able to trust your instincts, to allow system 1 to work and allow your system 2 to develop confidence.
  • this also feeds into notions of depression as an extreme loss of confidence. "learned helplessness" can come from trying things and having them fail (like a PhD research career, or a relationship), but it's also related to a mismatch between systems 1 and 2, where you don't trust system 1 to properly execute system 2's goals. both grad school and being LGBT are serious risk factors for depression. I think there really is a common thread here.
  • buddhism, "life is suffering", the importance of joy but also the necessity of sadness and suffering and the ability to step back and create distance from it. the impossibility of immortality due to the fluidness of identity. death as a zero-utility state, not to be feared. the value of meditation, of understanding your own mind (relating back to an "aligned" identity). don't crave anything. kabbalah-style, there is a crack in everything: means, the world is fallen and broken, but also every situation can be fixed, there is always light in the darkness, the things we thought we craved will not make us happy, and losing the things we thought we craved will not prevent happiness.
  • the value of traditional religion in creating purpose, community, and shared myths (Harari-style).
  • general patterns of productivity and research advice.
    • keeping notes, writing down thoughts
    • research as a social/communal process. the point of a paper is to convince and impress actual people
    • the importance of sharing ideas with people, early, to get feedback, develop the habit of framing those ideas coherently in clear, compelling language, and build relationships to improve those ideas
    • the value of working on important problems (PhdAdvisor-style), and the knowledge that a lot of published research is useless, no matter how impressive, so don't be intimidated.
    • the importance of ownership of research projects and agendas. for something as abstract and ill-defined as research, the motivation has to come from within --- it can't just be someone else's project. tenenbaum's "don't work on research you don't love".
      • in addition to motivation, this gets into identity. a good researcher genuinely cares about the long-term goal they are working towards, they daydream about it, they tie it into other parts of their life: they treat other papers and talks, even non-scientific books and articles, even just experiences in their everyday life, partly as a source of inspiration for new approaches, problems, applications. you can't do this and stay mentally healthy (in the sense of an "aligned" identity: where your system 1 and 2, or all your various subsystems and reflexes, are working together in a harmonious way) unless you are deeply invested in the work you're doing.
    • habits of research code and experimentation. I'm still bad at this but I've gradually learned useful reflexes:
      • save all intermediate results into files
      • experiments should be reproducible with a single command. including code to generate and preprocess data
      • algorithmic/research code should be short, <50 lines, easily fit in (a human being's) working memory
      • keep experimental cycles fast. test ideas on toy examples or small datasets before running large/complex experiments.
      • don't waste time watching experiments run.
      • time spent writing visualization or exploratory code is usually time well spent.
      • come up with a long-term goal and break it down into subgoals. given concrete subtasks, you can usually do things much faster than you think. a task that seems daunting might actually take just a few hours once you think concretely about how to approach it.
      • write up work as you do it. keep latex notes with equations and explanations. this forces you to think through ideas, to keep a record of your work, and makes it a lot easier to write the eventual paper.
    • writing research papers: a paper should tell a story. it should
      • clearly describe a problem (a goal that research should work towards), explain previous work in this direction and how it is lacking, and then present what you did and how it addresses the goal. it should be written fractally: the abstract should contain the entire paper, the intro an expanded version, the paper itself an expanded version, and push complexity into the appendix. the details that seem important to you (of what you actually did) are going to be skipped over by the vast majority of readers. you should include them, and do it carefully, but it's important that the casual reader of the paper feels like they learned something, like you showed them a new idea, that will go on to inspire their thought. Most papers are read deeply by only a few grad students replicating work, and the details are valuable for replication but ultimately only the bigger ideas become incorporated into the story of science.
      • of course which story to tell is a social thing. many ideas can be sold in different ways. communities will discount ideas that might actually be good, because they haven't been framed in the right way. a given idea might have many advantages; knowing which ones to concentrate on is
      • by "pushing complexity into the appendix" I also mean, start by describing the idealized version of your idea and not the fallen version you actually implemented. Yes your actual code is full of bugs and hacks and might not be quite a technically correct MCMC or EM algorithm or whatever. And you should acknowledge this, you don't want to lie to people so that they feel bad when the 'clean' version of the algorithm doesn't work for them. But you also should do them the favor of starting with the clean, ambitious high-level view, so that they understand the intellectual scaffolding around which the (fallen) implementation has been built.
    • similarly: people reading papers or watching talks are easily pleased if they feel like they learned something from the paper. they might not care as much about the novelty of your particular work, as about whether they feel like reading the paper gave them new ways of viewing things or an understanding of a set of questions or ideas that people care about. In talks people usually spend 10% of time on related work, and 90% on the new stuff. But in many cases (except for very specialized audiences) usually the audience would prefer the other balance: 90% teaching them useful things you know, and 10% what you are thinking about.
    • the value of arrogance. trying to 'keep up' with what all the top researchers are doing is trying to force yourself into their minds: you'll never be as comfortable there as they are. and a lot of what they're doing is arbitrary and wrong; trying to read optimality into it will just drive you crazy. of course you should understand the canon, the recurring high-level ideas, the curriculum, the common language of the field. but it's equally or more important to have the boldness to act, to try new things yourself, to not be intimidated by the vast literature of ongoing work. if you are solving a real problem, then what you do will usually be novel and interesting in some way.
      • usually anything you do can be seen as a special case of, or related to, something that already exists. But if you are ambitious enough, you can also frame the other thing as a special case of what you are doing. Neither of these perspectives is inherently more true than the other.
    • the importance of groundedness. entire subfields and subareas concern themselves with solving problems of their own creation. this is SuccessfulFriend's concern about the OSDI community: they have no Pagerank, no incoming edges from the broader world. They are playing a game for its own sake. Much of pure math is like this, but at least there the abstractions are very general and beautiful. And even good pure math has edges to other areas of math, and ultimately therefore to the real world. The dark side of arrogance is that researchers can convince themselves (and students) that their work is important when ultimately, it's not. Don't get depressed about this: just recognize that it happens, and don't be intimidated when academics are doing work you don't understand, because much of it really is bad. Irrelevance is a risk of research, but what matters is that you're working on projects where you see a path to ultimate value, however you define that. Don't get drawn into someone else's project where you don't see that, just for the sake of imagined academic prestige.
    • self-confidence as a driver of research progress. whether something is "novel" or "interesting" is a matter of perspective. imposter syndrome means you feel like all the work you do is valueless. but part of the value of research, and academics in general, is contagious enthusiasm. even if the thing you did isn't great, if you can speak (and write) compellingly about the larger vision, so that people become excited about the direction you're working in, then you've made progress. Excitement is not a zero-sum game. Many people aren't excited by any research ideas at all (I've been this way while depressed), or they don't spend their time thinking about research directions. If you can create excitement where there was none before, you've done something cool and useful.
    • the importance of the "humanities" view of research. Yes math and rigor and technical precision and falsifiability are important in the search for truth. But at its highest level, research is about ideas, that exist in human minds independently of any given formalization. Probability existed before Kolmogorov, and his formalization is not the only possible one, or necessarily the best. Various ideas in machine learning (under/overfitting, model capacity, generalization, forward vs inverse problems (generative/discriminative models), latent representations, sampling vs optimization, causality, probably many others) exist independent of any particular formalization, and recur across formalizations. Especially for such a young field, the particular equations or theorems or even entire formal paradigms you learn in class will probably end up being obsolete. But knowing many formalizations, remembering the approaches people have taken, understanding that research is a conversation and a social endeavor at least as much as adding theorems (or code) to The Book, lets you continue to provide insight as the world shifts around you.
    • also on the 'humanities' view of research: the loss of the notion of truth, or of uncritical understanding.
    • the view of research as playing an abstract game, of ideas, similar in a very broad sense to very abstract games of other types of leadership like politics or business or high art. The "moves" in these games may not be complicated, and the games not difficult in themselves. But what makes them so rarefied is that the moves ground out into many levels of finer moves until you arrive at the object level. So your ability to play the game at the high level also requires your ability to play out all the lower-level games in order to execute the moves. Directing a research program also requires the ability to write individual papers, to generate ideas, write and structure clear code, budget your time, make friends and build collaborative relationships, all the way down to the fine-grained level of being able to efficiently code or use a Unix commandline or quickly write legible English prose.
      • IMPORTANTLY, playing the abstract game doesn't require you to be smarter or better at game playing than people playing more concrete games. It just requires you to have gotten into the position of making moves in the abstract game, so that you gain enough experience to get good at it. Anyone who wants can play chess games and learn to get good at it. Very few people get to make many business decisions, or political decisions, or high-level research decisions, so very few people get to develop the skill of playing those games. Getting into the position to do so is part merit (to the extent you are responsible for grounding out your own moves, you need to develop those ground-level skills) but also in large part luck. Much of the move-grounding is done by others (business or political subordinates, or in research, by students and later by engineers who ultimately implement and productionize your ideas), so you can get good at playing the abstract game just by being given the opportunity to play it. Children in hereditary monarchies, or hereditary business dynasties, have no special properties but they often learn the skills of leadership just because they were lucky enough to be given the opportunity.
      • ALSO IMPORTANTLY, the abstract games are not necessarily more important than the concrete games. Going up in abstraction increases power, as an abstract move influences many concrete moves, but it also adds distance: an abstract move doesn't necessarily determine concrete moves and every abstraction is leaky. Ideally you only build up the abstraction hierarchy as high as is useful. But it definitely can happen that a corporation has so many levels of management that the CEO's decisions end up ultimately disconnected from the operations of the company. Or a researcher's innovations can end up entirely disconnected from practice. Of course a good researcher will seek to have "impact", and part of playing the abstract game well is to manage the abstraction hierarchy itself: a president is responsible for organizing their government, a researcher is responsible for choosing which ideas to work on and what sort of people to collaborate with. BUT in practice many abstractions certainly end up being bad. Much research is useless, much of modern finance and corporate structure probably doesn't maximize human welfare, much of government is corrupt or captured, etc. So it's reasonable to try to do the thing well, but it's also reasonable to accept that even a good effort at abstraction could be less effective than a dedication to the concrete, where you can both learn to do a thing well (because practice/rollouts are cheaper) and be more certain the thing itself is worth doing.
    • the unique location of academia in the intersection between skills and meta-skills. You not only have to be able to do good work, but also to talk about the work, to reflect on it, to explain how you do the work and criticize others' work. Many great authors are not great critics. They have an instinct for telling a great story, for getting inside the heads of compelling characters, but they could not give you a recipe book for what they do, or any fixed set of rules to follow. They can't necessarily analyze and criticize other work at a high level of theoretical sophistication. But an academic has to do the latter, in order to give feedback to students, and more broadly in order to participate in the research conversation and explain what's lacking in other ideas and why their ideas are useful and novel and valuable. They can't just have the skill of writing a good paper, but they have to have the meta-skill of explaining how to write a good paper: to boil down their instincts to a set of guidelines and principles they can explain to students, and whose presence (or lack) they can recognize in other work. This requires being comfortable with formalization: there is no truly universal set of rules for great work, but criticizing other work requires you to commit to at least some sort of framework for doing so. It's rare to have both the skills of a creator and a critic, a generator and a discriminator. The best researchers, the best academics, do have both. Having the skills of a critic helps you identify the flaws in your own work and improve it. But both skills are also individually useful and there's no shame in focusing on one.
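
A rough sketch of the research-code habits listed earlier (save intermediate results into files, make the experiment reproducible with a single command including the code that generates the data, keep the core short, try the idea on a toy problem first). Everything here is made up for illustration: the helper names `cached`, `make_data`, `fit_slope`, and `run`, the `results_toy` directory, and the toy slope-fitting task are all assumptions, and this is one possible way to set things up rather than a recommended framework.

```python
import json
import random
from pathlib import Path


def cached(path, compute):
    """Return the JSON stored at `path`; if it is missing, compute, save, and return it."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text())
    result = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result))
    return result


def make_data(n, seed):
    """Toy dataset (hypothetical): y = 2x + small Gaussian noise, with a fixed seed."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [[x, 2.0 * x + rng.gauss(0, 0.1)] for x in xs]


def fit_slope(data):
    """Least-squares slope of a line through the origin."""
    num = sum(x * y for x, y in data)
    den = sum(x * x for x, _ in data)
    return num / den


def run(out_dir, n=50, seed=0):
    """Whole experiment: generate data, fit, and save every intermediate result."""
    out_dir = Path(out_dir)
    data = cached(out_dir / "data.json", lambda: make_data(n, seed))
    return cached(out_dir / "slope.json", lambda: fit_slope(data))


if __name__ == "__main__":
    # Single command: `python experiment.py` regenerates anything that is missing.
    print(run("results_toy"))
```

Running it a second time reads `data.json` and `slope.json` back from disk instead of recomputing, so later visualization or exploratory code can start from the saved files. Deleting one file forces just that step to rerun.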