[R] Identify Leverage Points

Noah Silverman noahsilverman at ucla.edu
Sat Jul 20 00:14:57 CEST 2013


Hello,

I'm working on some fairly standard regression models (linear, logistic, and Poisson).  Unfortunately, the data is rather messy.

A visual inspection, using either a histogram or a density plot, indicates some significant outliers.  Summary statistics of the data indicate the same thing.

If I fit a linear regression in R using the "lm" command, I can then plot the model to look at residuals, etc.
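
For concreteness, a minimal sketch of that workflow. The data frame "d" and its columns "x" and "y" are simulated placeholders, not the real data:

```r
# Simulated stand-in data (hypothetical; the real data set is messier)
set.seed(1)
d <- data.frame(x = rnorm(200))
d$y <- 2 + 3 * d$x + rnorm(200)

# Ordinary least-squares fit
fit <- lm(y ~ x, data = d)

# Default diagnostic plots for an lm fit: residuals vs fitted,
# normal Q-Q, scale-location, and residuals vs leverage
par(mfrow = c(2, 2))
plot(fit)
```

The last of the four panels already plots leverage, but only visually.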

I'm interested in re-fitting the model with N% of the highest-leverage points removed.  (It's a large data set, and I want to fit "most" of the data.)

Is there a computational way to get the leverage for each data point?  That way I can subset the data, skipping the N% of points with the highest leverage.
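
A rough sketch of what I have in mind, assuming hatvalues() from the stats package (which returns the diagonal of the hat matrix for a fitted model) is a suitable leverage measure.  The data, the planted extreme values, and the 5% figure ("drop_pct") are all made up for illustration:

```r
# Simulated stand-in data with a few extreme predictor values (hypothetical)
set.seed(42)
n <- 500
d <- data.frame(x = rnorm(n))
d$x[1:5] <- c(8, -9, 10, 12, -11)
d$y <- 1 + 2 * d$x + rnorm(n)

fit <- lm(y ~ x, data = d)

# Leverage of each observation: the diagonal of the hat matrix
h <- hatvalues(fit)

# Drop the drop_pct% of observations with the highest leverage
drop_pct <- 5                                   # the "N" above; arbitrary here
cutoff <- quantile(h, probs = 1 - drop_pct / 100)
keep <- h <= cutoff

# Re-fit on the remaining observations
fit_trimmed <- lm(y ~ x, data = d[keep, ])
```

This relies on the data having no missing values, so that "h" lines up row-for-row with "d"; if lm() drops rows with NAs, the indexing would need more care.  As far as I can tell, hatvalues() also accepts glm fits, so the same pattern might carry over to the logistic and Poisson models.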


Thanks!


--
Noah Silverman, M.S., C.Phil
UCLA Department of Statistics
8117 Math Sciences Building
Los Angeles, CA 90095


