[R] grubbs.test

Thu Apr 14 19:44:18 CEST 2005

The Grubbs test is one of many old (1950s-'70s) classical tests for
outliers in a univariate, approximately normal sample. Here's a link:
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h.htm
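
For reference, the statistic described on that NIST page can be sketched in
a few lines of base R. This is only an illustration: grubbs_sketch is a
hypothetical helper name (not the grubbs.test function being asked about),
the data are made up, and it covers only the two-sided, single-outlier form.

```r
# Two-sided Grubbs statistic for a single suspected outlier, base R only.
# grubbs_sketch is a hypothetical name used for illustration.
grubbs_sketch <- function(x, alpha = 0.05) {
  n <- length(x)
  stopifnot(n >= 3)
  dev <- abs(x - mean(x))
  G <- max(dev) / sd(x)
  # Critical value built from the t distribution with n - 2 df
  t2 <- qt(alpha / (2 * n), df = n - 2)^2
  Gcrit <- (n - 1) / sqrt(n) * sqrt(t2 / (n - 2 + t2))
  list(G = G, Gcrit = Gcrit, suspect = x[which.max(dev)], reject = G > Gcrit)
}

x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 14.0)  # made-up data, n = 6
grubbs_sketch(x)
```

Note that with n = 6 the statistic cannot exceed (n - 1)/sqrt(n), about
2.04, whatever the data look like.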

I think it is fair to say that such outlier detection methods were long ago
found to have poor statistical properties and were supplanted by
robust/resistant techniques (computationally much more demanding, but who
cares these days!?), at least in the more straightforward linear models
contexts. rlm() in the MASS package is one good implementation of these
ideas in R. See the MASS book (Venables & Ripley, Modern Applied Statistics
with S) for a short but informative discussion and further references.
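
As a minimal sketch of what this looks like in practice: the data below are
made up, and the intercept-only model is my assumption for a small
single-sample setting, not something prescribed by the MASS book. It
compares the least-squares mean with rlm()'s default Huber M-estimate.

```r
library(MASS)  # ships with R as a recommended package

x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 14.0)  # made-up data, n = 6

fit_ls  <- lm(x ~ 1)    # least squares: the intercept is the sample mean
fit_rob <- rlm(x ~ 1)   # M-estimate of location (rlm's default Huber psi)

coef(fit_ls)
coef(fit_rob)
round(fit_rob$w, 2)     # final IWLS weights; a small weight marks a downweighted point
```

Rather than deleting the suspect point, the robust fit keeps it and gives it
a small weight, so no accept/reject decision about "outliers" is needed.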

I should add that the use of robust/resistant techniques exposes many
fundamental issues about estimation vs. inference, statistical modeling
strategies, etc. (The issues exist regardless; we statisticians just get
nervous talking publicly about them.) Important estimation and inference
questions for robust/resistant (R/R) estimators remain to be worked out, if
indeed it makes sense to think about things this way at all: for example,
for various kinds of mixed effects models, "statistical learning theory"
ensemble methods, etc. The problem, as always, is what the heck one means
by "outlier" in these contexts. It seems to be like pornography: "I know it
when I see it."*

Contrary views cheerfully solicited!

Cheers to all,

-- Bert Gunter

*Sorry -- that's a reference to a famous quote of Justice Potter Stewart, an
American Supreme Court Justice.
http://www.michaelariens.com/ConLaw/justices/stewart.htm

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of vito muggeo
> Sent: Thursday, April 14, 2005 7:05 AM
> To: Dave Evens
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] grubbs.test
> 
> Dear Dave,
> I do not know grubbs.test (is it a function, and where can I find it?),
> and n=6 data points are probably very few.
> 
> Having said that, what do you mean by "outlier"?
> If you mean deviation from the estimated mean (of previous data), you
> might have a look at the strucchange package (sorry, but right now I do
> not remember the exact name of the function).
> 
> best,
> vito
> 
> 
> Dave Evens wrote:
> > Dear All,
> > 
> > I have small samples of data (between 6 and 15) for
> > numerous time series points. I am assuming the data
> > for each time point is normally distributed. The
> > problem is that the data arrives sporadically and I
> > would like to detect the number of outliers after I
> > have six data points for any time period. Essentially,
> > once I have 6 data points, I would like to test
> > whether there are any outliers. If so, remove the
> > outliers, wait until I again have at least 6 data
> > points or until the sample size increases, and test
> > again whether there are any outliers. This process is
> > repeated until there are no more data points to add
> > to the sample.
> > 
> > Is it valid to use the grubbs.test in this way?
> > 
> > If not, are there any tests out there that might be
> > appropriate for this situation? Rosner's test requires
> > at least 25 data points, which I don't have.
> > 
> > Thank you in advance for any help.
> > 
> > Dave
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> > 
> 
> -- 
> ====================================
> Vito M.R. Muggeo
> Dip.to Sc Statist e Matem `Vianelli'
> Università di Palermo
> viale delle Scienze, edificio 13
> 90121 Palermo - ITALY
> tel: 091 6626240
> fax: 091 485726/485612
> 