potential outcomes: Nonlinear Function
Created: August 02, 2021
Modified: August 06, 2021

potential outcomes

This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

Different experimental conditions may give rise to different outcomes. For example, let the variable $C_1$ indicate whether a person is healthy given that they took vitamin C, and let $C_0$ indicate whether they are healthy given that they did not take vitamin C. We call $C_0$ and $C_1$ the potential outcomes, and we could define $E[C_1] - E[C_0]$ as the 'causal effect' of taking vitamin C.

Conceptually, the expectation above is taken over subjects in a population (effectively epistemic uncertainty over the factors that make one subject differ from another) and over any within-subject aleatoric uncertainty.

In reality, we observe each subject under only one of the two experimental conditions. We never observe both outcomes, so we can never directly calculate the causal effect for an individual. However, if we assume a population of iid subjects, then we can estimate $E[C_1]$ and $E[C_0]$, and hence the average effect, by randomly splitting the population into a treatment group and a control group, respectively.
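
As a rough illustration of the treatment/control estimate, here is a minimal simulation sketch. Everything in it is an assumption made for the example: the function name `simulate_trial`, the outcome probabilities 0.6 and 0.7, and the convention that `c1` is health with vitamin C and `c0` is health without it. It draws both potential outcomes for every subject, assigns treatment at random, and then uses only the one outcome that would actually be observed.

```python
import random


def simulate_trial(n_subjects=100_000, seed=0):
    """Estimate E[C_1] - E[C_0] from a simulated randomized trial."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_subjects):
        # Hypothetical potential outcomes (1 = healthy). Both exist for every
        # subject, but only one of them will ever be observed.
        c0 = 1 if rng.random() < 0.6 else 0  # health without vitamin C
        c1 = 1 if rng.random() < 0.7 else 0  # health with vitamin C

        # Random assignment, independent of the potential outcomes.
        takes_vitamin_c = rng.random() < 0.5

        # The observed outcome is the potential outcome selected by the assignment.
        if takes_vitamin_c:
            treated.append(c1)
        else:
            control.append(c0)

    # Difference of group means estimates E[C_1] - E[C_0].
    return sum(treated) / len(treated) - sum(control) / len(control)


estimate = simulate_trial()
# By construction the true average effect in this simulation is 0.7 - 0.6 = 0.1.
print(f"estimated average effect: {estimate:.3f}")
```

The estimate only lands near the true value because assignment is independent of the potential outcomes. If subjects chose for themselves whether to take vitamin C, the two group means could differ for reasons unrelated to the treatment.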

The causal effect 'for a specific individual' is effectively a counterfactual.

In do-calculus notation, we'd write $C_0 = p(Y \mid \mathrm{do}(X=0))$ and $C_1 = p(Y \mid \mathrm{do}(X=1))$, where $Y$ indicates health and $X$ indicates taking vitamin C.

Suppose we have two subjects, with

| Subject | X | Y | $C_0$ | $C_1$ |
|----|----|----|----|----|
| A | 0 | 0 | 0 | 0* |
| B | 1 | 1 | 1* | 1 |

where * denotes an unobserved potential outcome. We see that there is a perfect correlation between taking vitamin C and staying healthy. But we also see that the causal effect is zero ($0.5 - 0.5 = 0$), because neither subject's health would have changed in the counterfactual world.
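
The arithmetic for the two-subject example can be checked with a short sketch. The variable names are made up for illustration, and the code has access to both potential outcomes for each subject, which is exactly what a real experiment never provides.

```python
# (subject, X, C_0, C_1) for the two subjects in the example.
subjects = [("A", 0, 0, 0), ("B", 1, 1, 1)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# The observed outcome Y is the potential outcome selected by the treatment X.
observed = [(x, c1 if x == 1 else c0) for _, x, c0, c1 in subjects]

# Naive comparison of observed outcomes: treated mean minus untreated mean.
naive_difference = (
    mean(y for x, y in observed if x == 1)
    - mean(y for x, y in observed if x == 0)
)

# Causal effect E[C_1] - E[C_0], averaging each potential outcome over all subjects.
causal_effect = (
    mean(c1 for _, _, _, c1 in subjects)
    - mean(c0 for _, _, c0, _ in subjects)
)

print(naive_difference)  # 1.0
print(causal_effect)     # 0.0
```

The naive difference of 1.0 reflects the perfect correlation in the observed data, while the causal effect computed from the full set of potential outcomes is 0.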
