The central role of the propensity score in observational studies for causal effects: summary

Observational studies pose obstacles to causal inference for several reasons, chiefly selection bias: without randomisation, the potential outcomes are not guaranteed to be independent of treatment assignment, and the units exposed to treatment can (and usually do) differ systematically from the control units.

The paper defines a balancing score \(b(X)\) as any function of the observed pretreatment covariates X such that the conditional distribution of X given \(b(X)\) is the same for treated and control units. X itself is a balancing score, but the objective is to find a coarser, many-one balancing score that brings about dimension reduction.
The propensity score \(e(x) = \Pr(Z = 1 \mid X = x)\) is the conditional probability of assignment to treatment given the covariates. In a randomised study this probability is known by design; in a non-randomised study it is unknown and has no specific form. However, it is possible to estimate the score from the data (using, for example, a generalised linear model such as a logit). Moreover, the propensity score is a balancing score. In fact it is the coarsest balancing score: it can be written as a function of any other balancing score, whereas X itself is the finest.
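
As a minimal sketch of the estimation step, the snippet below fits a logit model of Z on X by Newton-Raphson using only NumPy. The simulated data, the coefficient values and the helper name `fit_propensity` are assumptions made for illustration; they do not come from the paper.

```python
import numpy as np

def fit_propensity(X, z, n_iter=25):
    """Fit a logistic regression of z on X by Newton-Raphson.

    Returns the fitted scores e_hat(x_i) and the coefficient vector.
    """
    X1 = np.column_stack([np.ones(len(X)), X])    # add an intercept column
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X1 @ beta))      # current fitted probabilities
        grad = X1.T @ (z - p)                     # score vector
        hess = X1.T @ (X1 * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    e_hat = 1.0 / (1.0 + np.exp(-X1 @ beta))
    return e_hat, beta

# Hypothetical simulated data: two covariates, true logit = 0.5*x1 - 0.5*x2
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 2))
true_e = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))
z = rng.binomial(1, true_e)

e_hat, beta_hat = fit_propensity(X, z)
print("estimated coefficients:", np.round(beta_hat, 2))
```

In practice a library routine (for instance a GLM or logistic-regression fitter) would normally replace the hand-written loop; the loop is shown only to keep the sketch self-contained.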

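Treatment assignment is ‘strongly ignorable’ given X if, conditional on the covariates X, the potential outcomes are independent of assignment and every unit has a probability of treatment strictly between 0 and 1. If assignment is strongly ignorable given X, then it is also strongly ignorable given the propensity score: conditional on \(e(X)\), the potential outcomes are independent of assignment.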
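
In symbols, writing \(Y(1), Y(0)\) for the potential outcomes (this notation is an assumption, chosen to match the rest of this summary rather than the paper's own), the result can be stated as:

\[
(Y(1), Y(0)) \perp Z \mid X, \quad 0 < \Pr(Z = 1 \mid X) < 1
\;\;\Longrightarrow\;\;
(Y(1), Y(0)) \perp Z \mid e(X), \quad 0 < \Pr(Z = 1 \mid e(X)) < 1 .
\]

The key step is an iterated expectation, using strong ignorability given X in the second equality:

\[
\Pr(Z = 1 \mid Y(1), Y(0), e(X))
= E\big[\Pr(Z = 1 \mid Y(1), Y(0), X) \,\big|\, Y(1), Y(0), e(X)\big]
= E\big[e(X) \,\big|\, Y(1), Y(0), e(X)\big]
= e(X),
\]

which does not depend on \(Y(1), Y(0)\).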

Under strong ignorability, conditioning on the propensity score gives us an unbiased estimate of the treatment effect. Units with the same value of the propensity score but different treatments can act as controls for each other: the expected difference in their responses equals the average treatment effect at that value of the score, and averaging over the distribution of the score gives the overall average treatment effect. The applications of the propensity score extend to pair matching, stratification (discretising the propensity score) and covariance adjustment.

The paper discusses three main applications of propensity scores.

Pair matching: This means forming pairs of treated and control units in our data. First, randomly sample a propensity score value, and then pick one treated unit and one control unit with that propensity score value. Pair matching on the propensity score has lower variance: we saw the exact formulas in lecture.

Covariance adjustment: With the propensity score playing the role of X, within each stratum of the propensity score we can fit a linear regression: this is the same as OLS with interactions, or Lin’s estimator.
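
The pair-matching idea above can be sketched as follows. This is not the paper's sampling scheme: it swaps in greedy 1:1 nearest-neighbour matching without replacement, a common practical stand-in when exact ties on the score are unavailable. The simulated data, the constant effect of 2 and the helper name `match_pairs` are assumptions for illustration, and the true score is used only because the data are simulated.

```python
import numpy as np

def match_pairs(e, z):
    """Greedy 1:1 nearest-neighbour matching on the score, without replacement.

    Returns a list of (treated_index, control_index) pairs.
    """
    treated = np.flatnonzero(z == 1)
    controls = list(np.flatnonzero(z == 0))
    pairs = []
    for t in treated:
        if not controls:          # control reservoir exhausted
            break
        j = int(np.argmin(np.abs(e[controls] - e[t])))  # closest remaining control
        pairs.append((int(t), int(controls.pop(j))))
    return pairs

# Hypothetical simulated data with one confounder x
rng = np.random.default_rng(1)
n = 3000
x = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-(0.5 * x - 1.0)))   # true score (known only in simulation)
z = rng.binomial(1, e)
y = 2.0 * z + x + rng.normal(size=n)          # constant treatment effect of 2

pairs = match_pairs(e, z)
matched_diff = float(np.mean([y[t] - y[c] for t, c in pairs]))
naive_diff = float(y[z == 1].mean() - y[z == 0].mean())
print(f"naive difference: {naive_diff:.2f}, matched difference: {matched_diff:.2f}")
```

Because every treated unit is matched, the matched difference targets the effect among the treated; it coincides with the average treatment effect here only because the simulated effect is constant.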

Stratification: Conditional on the propensity score, we can estimate the ATE as we would in a stratified experiment. Stratifying on the propensity score rather than on the covariates themselves is useful because stratifying on all covariates can create so many groups that some strata contain no treated units or no control units. If \(e(X)\) is continuous, we can discretise it into a small number of strata. The paper cites Cochran's (1968) finding that five subclasses are often sufficient to remove over 90% of the bias due to a single continuous covariate, which is why five strata is a common choice.
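
A minimal sketch of the stratified estimator, assuming strata are formed at the quintiles of the score and stratum differences are weighted by stratum size. The simulated data and the helper name `stratified_ate` are illustrative assumptions, and the true score again stands in for an estimated one.

```python
import numpy as np

def stratified_ate(y, z, e, n_strata=5):
    """Size-weighted average of within-stratum differences in mean outcome."""
    edges = np.quantile(e, np.linspace(0, 1, n_strata + 1))
    # Map each score to a stratum index in 0..n_strata-1
    stratum = np.clip(np.searchsorted(edges, e, side="right") - 1, 0, n_strata - 1)
    weighted_sum, total = 0.0, 0
    for s in range(n_strata):
        in_s = stratum == s
        t, c = in_s & (z == 1), in_s & (z == 0)
        if t.sum() == 0 or c.sum() == 0:
            continue              # stratum lacks one group; it cannot contribute
        weighted_sum += in_s.sum() * (y[t].mean() - y[c].mean())
        total += in_s.sum()
    return weighted_sum / total

# Hypothetical simulated data with one confounder x
rng = np.random.default_rng(2)
n = 5000
x = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-(0.5 * x - 1.0)))   # true score (known only in simulation)
z = rng.binomial(1, e)
y = 2.0 * z + 2.0 * x + rng.normal(size=n)    # constant treatment effect of 2

strat_est = float(stratified_ate(y, z, e))
naive_est = float(y[z == 1].mean() - y[z == 0].mean())
print(f"naive: {naive_est:.2f}, stratified (5 strata): {strat_est:.2f}")
```

With only five strata some within-stratum imbalance remains, so the stratified estimate is expected to be close to, but not exactly at, the true effect.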

The paper gives several rationales for using the propensity score in observational studies, including the following. Conditioning on the propensity score makes pair matching more intuitive, and makes the study easier to understand for people with a limited statistical background. Model-based adjustment also becomes more robust, even if the assumed model for regressing outcomes on the propensity score is incorrect.

If you have a large reservoir of potential controls (say you are interested in the health of workers at a nuclear plant), then it is easier to stratify on the propensity score and then sample from the control group than to do random subgrouping. Of course, there can be residual bias in the estimated propensity score, and in that case the estimate of the ATE is biased as well.

To summarise: the propensity score describes how the treatment assignment mechanism depends on the covariates. It is the probability, given X, that we will observe Y(1) rather than Y(0). Fitting a model for \(E(Z \mid X)\) gives us an estimate of the propensity score, which is reliable only if the model is adequately specified, and under strong ignorability there are various ways to estimate the average causal effect in an otherwise non-randomised observational study: pair matching, stratification and covariance adjustment.