[R] a question about LMS and what constitutes outliers

Rajarshi Guha rxg218 at psu.edu
Thu Oct 6 20:57:35 CEST 2005


Hi,
  I have been using the lqs function with method='lms'. However the
results I get are a little different from the results noted by Rousseeuw
& Leroy (Robust Regression and Outlier Detection) and I was wondering
how to use these results for outlier detection.

I'm using the stackloss dataset, for which the original Rousseeuw et al.
program points out that observations 1,2,3,4 and 21 are outliers.

This conclusion is arrived at by testing whether the absolute value of
the residual is greater than 2.5 * standard error.

Next I ran lqs as:

library(MASS)  # lqs() lives in MASS
m <- lqs(stackloss[,-4], stackloss[,4], method='lms',
         control=list(psamp=4, nsamp='exact', adjust=TRUE))

(I ran it exhaustively since that was how I ran the original program
from Rousseeuw)

The coefficients obtained from lqs() are more or less identical to those
obtained by the original program. However, the scale estimates do not
match. I assume that this would be because of the per sample
adjustments.

Now if I want to decide whether an observation is an outlier I use the
condition

which( abs(m$resid) > 2.5 * m$scale[1] )

and this gives me

 1  2  3  4  8 13 14 20 21
 1  2  3  4  8 13 14 20 21
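
For anyone wanting to reproduce the check, here is a minimal self-contained
sketch. It assumes the MASS package is installed; the names fit, std_res and
flagged are introduced only for illustration, and the flagged set may differ
from the listing above depending on the MASS version.

```r
library(MASS)  # provides lqs()

# LMS fit with exhaustive enumeration of 4-point subsets, as in the post
fit <- lqs(stackloss[, -4], stackloss[, 4], method = "lms",
           control = list(psamp = 4, nsamp = "exact", adjust = TRUE))

# Residuals divided by the first scale estimate
std_res <- fit$residuals / fit$scale[1]

# Observations whose standardised residual exceeds the 2.5 cutoff
flagged <- which(abs(std_res) > 2.5)

print(round(std_res, 2))
print(flagged)
```

Printing the standardised residuals next to the flagged indices shows how
far each observation sits from the 2.5 cutoff, which helps when judging
borderline cases.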

Now, it includes the original outliers as noted by Rousseeuw, but also 4
extra ones. From a plot of the residuals I can see obs 13,14,20 possibly
being regarded as outliers but 8 seems a stretch.

I tried evaluating the above condition with m$scale[2] but I get the
same result. I also tried running lqs() with adjust=FALSE in which case
using the above condition obs 1,2,3,4,13,20,21 are regarded as outliers.
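
The comparisons described above can be put side by side with a short
sketch. The helper flag_outliers() is hypothetical (it is not part of MASS)
and only wraps the 2.5 * scale rule; per the lqs() documentation, the first
scale estimate is based on the fit criterion and the second on the variance
of the residuals smaller in absolute value than 2.5 times the first.

```r
library(MASS)  # provides lqs()

# Hypothetical helper: indices with |residual| > cutoff * chosen scale estimate
flag_outliers <- function(fit, which_scale = 1, cutoff = 2.5) {
  which(abs(fit$residuals) > cutoff * fit$scale[which_scale])
}

x <- stackloss[, -4]
y <- stackloss[, 4]

# Same exhaustive LMS fit, with and without the per-sample adjustment
fit_adj <- lqs(x, y, method = "lms",
               control = list(psamp = 4, nsamp = "exact", adjust = TRUE))
fit_noadj <- lqs(x, y, method = "lms",
                 control = list(psamp = 4, nsamp = "exact", adjust = FALSE))

results <- list(
  adj_scale1   = flag_outliers(fit_adj, 1),
  adj_scale2   = flag_outliers(fit_adj, 2),
  noadj_scale1 = flag_outliers(fit_noadj, 1),
  noadj_scale2 = flag_outliers(fit_noadj, 2)
)

# Scale estimates for both fits, then the flagged observations for each case
print(rbind(adjust_TRUE = fit_adj$scale, adjust_FALSE = fit_noadj$scale))
print(results)
```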

So my questions are

1) Am I correct in using the above condition to determine whether an
observation is an outlier?

2) If so, is it correct that lqs() will detect more outliers than noted
by the original book/program?

Thanks,

-------------------------------------------------------------------
Rajarshi Guha <rxg218 at psu.edu> <http://jijo.cjb.net>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
After an instrument has been assembled, extra components will be found
on the bench.



