[R] [OT] propensity score implementation
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Sun Nov 9 03:39:50 CET 2008
Wensui Liu wrote:
> Dear All,
> My question is more a statistical question than an R question. The reason I
> am posting here is that there are lots of excellent statisticians on this
> list, who can always give me good advice.
> Per my understanding, the purpose of the propensity score is to reduce bias
> when estimating the treatment effect, and its implementation is a 2-stage
> process:
> 1) First, let T = 1 if an individual belongs to the treatment group and
> T = 0 otherwise, and let X be a covariate matrix that explains the
> assignment of treatment. Then the propensity score is the probability of
> treatment exposure T = 1 and can be formulated as
> PPscore = Prob(T=1|X) = exp(A * X) / [1 + exp(A * X)], which lies in the
> range between 0 and 1.
> 2) At the second stage, let Y = 1 / 0 be a binary outcome variable and Z the
> covariate matrix that explains the outcome. In order to balance the
> probability of an individual being assigned to the treatment group such that
> Prob(Y = 1) _|_ Prob(T = 1|X), we should model the outcome as
> Prob(Y = 1|Z) = exp(B * Z) / [1 + exp(B * Z)], weighting or matching by
> the propensity score.
> The above is just my general understanding of the propensity score. However,
> I was criticized that my understanding is wrong and was also told that the
> response variable should be Y instead of T in the propensity model at the
> 1st stage. I am very confused and would like to have the opinion of experts
> like you guys.
If the response were Y then this would not be a propensity model.
Whoever told you that is off the mark.
Think of the propensity score as a data reduction method that allows you
to model all known baseline variables against the treatment assignment
in order to remove confounding bias in all of them. Then the outcome
model can have the logit of propensity (plus nonlinear transformations
of it) as a covariate to account for confounding. The outcome model
also needs to have strong predictor variables in it to account for
outcome heterogeneity not related to confounding. You can also use
matching as you mentioned but I prefer to adjust for propensity by
covariate adjustment once I check the overlap of propensity in the two
groups.
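
The covariate-adjustment approach described above might be sketched in R
roughly as follows. This is only an illustration on simulated data, not code
from the original message: the variable names (x1, x2, x3, trt, y), the
simulated effect sizes, and the choice of a natural spline with 3 degrees of
freedom for the nonlinear terms are all assumptions.

```r
library(splines)  # ships with R; provides ns() for natural splines

# Simulated data (hypothetical): three baseline covariates, a treatment
# whose assignment depends on them, and a binary outcome.
set.seed(1)
n  <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rbinom(n, 1, 0.4)
trt <- rbinom(n, 1, plogis(-0.3 + 0.8 * x1 - 0.5 * x2 + 0.4 * x3))
y   <- rbinom(n, 1, plogis(-1 + 0.5 * trt + 0.7 * x1 + 0.6 * x2))
d   <- data.frame(y, trt, x1, x2, x3)

# Stage 1: propensity model. The response is the treatment T, not Y.
ps_fit <- glm(trt ~ x1 + x2 + x3, family = binomial, data = d)
d$lp   <- predict(ps_fit, type = "link")  # logit of the propensity score

# Check overlap of the logit propensity in the two treatment groups
overlap <- tapply(d$lp, d$trt, range)
print(overlap)

# Stage 2: outcome model with treatment, a nonlinear function of the
# logit propensity, and strong outcome predictors (assumed: x1, x2).
out_fit <- glm(y ~ trt + ns(lp, df = 3) + x1 + x2,
               family = binomial, data = d)
print(coef(summary(out_fit))["trt", ])
```

In practice one would also plot the two distributions of lp (for example
with overlaid histograms) rather than rely on the ranges alone, and restrict
the analysis to the region where the groups overlap.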
> Any insight will be appreciated.
> Have a nice weekend!
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University