Modified: August 06, 2021
potential outcomes
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

Different experimental conditions may give rise to different outcomes. For example, let the variable $Y_1$ indicate whether a person is healthy, given that they took vitamin C, and let $Y_0$ indicate whether they are healthy, given that they did not take vitamin C. We call $Y_0$ and $Y_1$ the potential outcomes, and we could define $E[Y_1 - Y_0]$ as the 'causal effect' of taking vitamin C.
Conceptually, the expectation above is taken over both the subjects in a population (effectively epistemic uncertainty over the factors that make one subject differ from another) and any within-subject aleatoric uncertainty.
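In reality, we observe only one of the two experimental conditions for any given subject. We never observe both outcomes, so we can never directly calculate the causal effect for an individual. However, if we assume a population of iid subjects, and treatment is assigned independently of the potential outcomes (for example, at random), then we can estimate the averages $E[Y_1]$ and $E[Y_0]$ by splitting the population into a treatment and a control group, respectively. The difference between the two group means then estimates the average causal effect.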
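As a rough sketch of the treatment/control estimate, the simulation below generates both potential outcomes for every subject (called `y0` and `y1` here, for "no vitamin C" and "vitamin C"), then reveals only one of them per subject through random assignment. The function name `simulate` and all the rates (a 0.6 baseline chance of being healthy, vitamin C helping 10% of otherwise-unhealthy subjects) are made up for illustration.

```python
import random


def simulate(n=100_000, seed=0):
    """Return (true average effect, estimate from a randomized split).

    The rates below are arbitrary assumptions, chosen only for illustration.
    """
    rng = random.Random(seed)
    treated, control, effects = [], [], []
    for _ in range(n):
        # Both potential outcomes exist for every simulated subject...
        y0 = 1 if rng.random() < 0.6 else 0
        y1 = 1 if (y0 == 1 or rng.random() < 0.1) else 0
        effects.append(y1 - y0)
        # ...but random assignment reveals only one of them.
        if rng.random() < 0.5:
            treated.append(y1)
        else:
            control.append(y0)
    true_effect = sum(effects) / n
    estimate = sum(treated) / len(treated) - sum(control) / len(control)
    return true_effect, estimate


true_effect, estimate = simulate()
print(f"true average effect:      {true_effect:.4f}")
print(f"treatment minus control:  {estimate:.4f}")
```

The true average effect is only computable here because the simulation knows both potential outcomes; the estimate uses nothing but the observed one per subject, and the two should agree up to sampling noise.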
The causal effect 'for a specific individual' is effectively a counterfactual.
In do-calculus notation, we'd write $E[Y \mid do(X = 1)]$ and $E[Y \mid do(X = 0)]$, where $Y$ indicates health and $X$ indicates taking vitamin C.
Suppose we have two subjects, with

| Subject | X | Y | $Y_0$ | $Y_1$ |
|----|----|----|----|----|
| A | 0 | 0 | 0 | 0* |
| B | 1 | 1 | 1* | 1 |

where * denotes an unobserved potential outcome. We see that there is a perfect correlation between taking vitamin C and staying healthy. But we also see that the causal effect is zero ($E[Y_1 - Y_0] = 0$), because neither subject's health would have changed in the counterfactual world.
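The two-subject example can be checked mechanically. The sketch below encodes the table with hypothetical field names (`x` for taking vitamin C, `y0` and `y1` for the potential outcomes without and with it), then compares the naive difference in observed group means against the per-subject causal effects.

```python
from statistics import mean

# Each row: (subject, x, y0, y1). x = 1 means the subject took vitamin C.
# y0 / y1 are the potential outcomes without / with vitamin C; in a real
# study only the one matching x would be known.
subjects = [
    ("A", 0, 0, 0),
    ("B", 1, 1, 1),
]

# Observed outcome: the potential outcome selected by the treatment taken.
observed = [(x, y1 if x == 1 else y0) for _, x, y0, y1 in subjects]

# In this data, observed health tracks treatment exactly.
perfectly_correlated = all(x == y for x, y in observed)

# Naive comparison of group means, using observed outcomes only.
naive_difference = (
    mean([y for x, y in observed if x == 1])
    - mean([y for x, y in observed if x == 0])
)

# Causal effects, which need both potential outcomes for each subject.
individual_effects = [y1 - y0 for _, _, y0, y1 in subjects]
average_effect = mean(individual_effects)

print("health tracks treatment:", perfectly_correlated)
print("naive difference in means:", naive_difference)
print("individual causal effects:", individual_effects)
print("average causal effect:", average_effect)
```

The naive difference comes out as 1 while every individual effect is 0: here treatment is perfectly aligned with the potential outcomes, which is exactly the situation a randomized split is meant to rule out.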