[R] Identify Leverage Points
noahsilverman at ucla.edu
Sat Jul 20 00:14:57 CEST 2013
I'm working on some fairly standard regression models (linear, logistic, and Poisson). Unfortunately, the data is rather messy.
A visual inspection, using either a histogram or a density plot, indicates some significant outliers. Furthermore, summary statistics of the data indicate the same thing.
If I fit a linear regression in R using the "lm" command, I can then plot the model to look at residuals, etc.
I'm interested in re-fitting the model with N% of the highest-leverage points removed. (It's a large data set, and I want to fit "most" of the data.)
Is there a computational way to get the leverage for each data point? That way I can subset the data skipping N% of the highest leverage ones.
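For concreteness, here is a minimal sketch of the kind of workflow described above, assuming base R's hatvalues() from the stats package, which returns the diagonal of the hat matrix for a fitted "lm" (and also "glm") object. The data frame dat, the variables x and y, and the choice N = 5 are all hypothetical stand-ins, not the real data. Note that leverage depends only on the predictor values, so this trims unusual x values rather than large residuals.

```r
# Simulated data stands in for the real data set (hypothetical).
set.seed(1)
n <- 200
dat <- data.frame(x = c(rnorm(n - 5), rnorm(5, mean = 10)))  # last 5 rows are extreme in x
dat$y <- 2 + 3 * dat$x + rnorm(n)

fit <- lm(y ~ x, data = dat)

# One leverage value (hat-matrix diagonal) per observation used in the fit.
lev <- hatvalues(fit)

# Drop the N% of observations with the highest leverage (N = 5 is arbitrary).
N <- 5
cutoff <- quantile(lev, probs = 1 - N / 100)
keep <- lev <= cutoff

# Refit on the remaining rows.
fit_trimmed <- lm(y ~ x, data = dat[keep, ])
```

The same pattern should carry over to the logistic and Poisson cases by replacing lm() with glm() and the appropriate family. If the real data contain missing values, the vector from hatvalues() may not line up row-for-row with the original data frame, so the subsetting step would need care.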
Noah Silverman, M.S., C.Phil
UCLA Department of Statistics
8117 Math Sciences Building
Los Angeles, CA 90095