Modified: February 25, 2022
attention and utility
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

Thesis: attention is the dominant factor in our utility. It follows that work that shifts our attention can genuinely improve global utility, as much as or more than work that changes base-level reality.
Let's first acknowledge that there are good and bad base-level experiences in life: there is hunger, boredom, pain, loneliness, etc., and there are moments of joy, community, discovery, satisfaction, etc. It's not unreasonable to argue that there are 'more' bad base-level experiences in life than good ones, that most people spend most of their lives in some sort of state of quiet desperation, frustration, unfulfillment, general dissatisfaction with their circumstances (Buddhist dukkha), and that none of this even has any meaning. Therefore life is overall negative utility, a bad thing, immoral to bring a new child into, and those of us alive can only muddle on as best we can. (This is roughly David Benatar's 'The Human Predicament'.)
But humans don't experience utility at the level of base-level rewards. Our conscious selves are defined in many ways by attention mechanisms. If we are experiencing pain in the pursuit of a long-term goal, we can (sometimes) choose to attend to the progress we are making towards the goal, and ignore the pain. In this case our conscious self experiences reward, even if there is nothing but pain at the lowest level. Importantly, this is true even if the progress is only illusory, or the goal is ultimately futile (as all goals are, ultimately, since nothing matters).

A vague intuition I have for this is that the conscious self exists in a sort of higher-level MDP, in which the 'rewards' are actually some approximation of the value function for a lower-level MDP in which rewards are sparse or even mostly negative. But this hierarchy is not necessarily hard: we have the ability to attend to the lower-level rewards, or to some mixture of rewards at different levels (value fns mixed with 'true' rewards, or even value fns at different levels), so that depending on what we attend to, the world may seem generally good or generally bad.
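A toy numerical sketch of this intuition (everything here is invented for illustration: the numbers, the linear mixing rule, and the idea of a single scalar 'attention weight' are all simplifying assumptions, not a claim about how brains actually combine these signals). Base-level rewards are mostly small and negative with a few sparse joys, while a higher-level signal rewards increments of perceived progress toward a goal; experienced utility is an attention-weighted mixture of the two:

```python
import random

random.seed(0)

T = 100  # timesteps in a life (or a project)

# Base-level rewards: a constant low-grade grind, with a few sparse joys.
base_reward = [-0.1] * T
for i in random.sample(range(T), 5):
    base_reward[i] = 1.0

# Higher-level 'reward': increments of perceived progress toward a goal,
# scaled so the total is comparable in magnitude to the base rewards.
progress = [t / (T - 1) for t in range(T)]
value_signal = [20.0 * (progress[t] - (progress[t - 1] if t > 0 else 0.0))
                for t in range(T)]

def experienced_utility(attention_to_value):
    """Attention-weighted mixture of base rewards and the progress signal.

    attention_to_value in [0, 1]: 0 attends only to base-level rewards,
    1 attends only to the higher-level progress signal.
    """
    w = attention_to_value
    return sum((1 - w) * r + w * v for r, v in zip(base_reward, value_signal))

# Same base-level world, two attention policies:
print(experienced_utility(0.0))  # attending only to raw rewards: negative
print(experienced_utility(0.9))  # attending mostly to progress: positive
```

The point of the sketch is just that the sign of the experienced total flips with the attention weight, even though the underlying world is identical in both cases -- and note that the progress signal pays out regardless of whether the progress is 'real'.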
It's tempting to say that the world really is generally bad, and that we would be fooling ourselves by attending only to the good parts, or to abstractions or stories that hide bad aspects of base-level reality. However, assuming we can maintain that attention, it feels perverse to say that a life in which the lived experience is positive and full of reward is a bad life. If bad things are happening 'to me' but no one is experiencing them (because my conscious self is focused elsewhere), are they really bad? Given that we have to attend somewhere, we would also in some sense be 'fooling ourselves' by attending only to the base-level negative rewards, which are after all not normatively privileged in any sense. (A formal mathematical MDP does privilege the goal of maximizing base-level reward, but all models are wrong; it seems clear that the way to maximize the reward of the actual agent in the system -- my conscious self, which includes the power of attention -- is to attend to the nice levels of the hierarchy rather than the depressing ones.)
Of course, most of us don't have unlimited power to direct our attention (though strengthening control over attention is possible through meditation). There are bad things that we cannot ignore: serious physical pain, loss and grief, etc. And sometimes we don't have good things to attend to even at the higher levels. One characteristic of depression is a loss of goals: not knowing what you want to achieve with enough clarity and confidence that you can take steps in that direction, or not even believing that any long-term goals are valuable at all.
The depressive view is not wrong: the base-level world might genuinely be bad and might continue to be bad even after any plausible long-term goal is achieved, so there really might not be anything worth doing in terms of improving the base-level world. But happiness is, like money, a bit of a mass delusion: it's the belief that things we do will improve the world. And as with money, the belief itself makes the thing real: if we fool ourselves into believing that a goal is valuable, we begin to get utility from progress towards that goal, and then the goal itself does become valuable, if only for the journey it enables. Consider sports as an example: there is no intrinsic value at all in kicking a ball around better than some other people, but by creating the mass delusion that there is, we can become invested in winning the game and actually conjure a significant amount of fun straight out of the ether.
I still think there are lessons here for AI research. I am not sure exactly what they are -- if we really do care about optimizing a base utility fn, it's not obviously a good idea to create agents with the ability to attend to intermediate value functions and perhaps in the end fool themselves into thinking they are deriving object-level utility when in fact they are not. On the other hand, the human ability to come up with our own goals, to persevere through long regions of no reward in service of higher-level goals, to shift attention flexibly across multiple levels of a goal/reward hierarchy, and the lack of any strongly-typed mathematical distinction between 'reward' and 'value' functions in whatever the brain does, seem like important sources of inspiration…
There should also be lessons here for ethics. Maximizing human utility could involve preventing or avoiding object-level pain, but also helping people distract themselves from negative experiences and focus on positive ones -- simply shifting attention can massively change the lived experience of utility. Conversely, there is not much value in tech that alleviates minor or even major frustrations that we didn't even attend to before the tech was available to prevent them. And depression and grief are still some of the most negative-utility situations, because they prevent us from engaging in this sort of attention-based utility-laundering. Things that help people focus on the positive or provide a longer-term sense of purpose become very valuable (e.g. religion -- though of course religion isn't true, which isn't a problem per se, but does make it an unreliable solution, since at this point it's hard to sustain the illusion of purpose it provides; we need more sustainable illusions).
It does seem like something is very twisted about this philosophy, in which nothing is really worth doing, but where we need to believe things are worth doing in order to be happy (but then we genuinely will be!). How do we bootstrap ourselves into such beliefs, since we can't really argue for them on first principles (until we believe it, the belief will actually be false!)? I guess the answer is that we can't: this process has to engage somewhat with the pre-rational parts of the brain, and probably to be more social in nature. If everyone around you believes a goal is worthwhile, you'll tend to believe it: even if you can't rationally justify the goal on its own terms, it becomes worthwhile because steps towards it will make your friends and community happy, which is worthwhile in itself. So as long as a critical mass in a community can maintain the delusion, the sense of purpose is self-perpetuating and reinforcing within the community (conspiracy is a thing for a reason). This doesn't really resolve the question, in that it just pushes it up another level, but the fact that many people do seem to feel a sense of purpose, to various degrees, inspired by their communities implies that it is at least possible, even if we don't have a foolproof mechanism for making it happen.