[R] LSD, HSD,...
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Jul 17 12:43:15 CEST 2007
On Tue, 17 Jul 2007, John Maindonald wrote:
> <follow-on rant> Stepwise regression variable selection
> methods make multiple post hoc comparisons. The
Some do, but step() (the only way offered in base R) does not test at all.
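For example, step() chooses among candidate models purely by AIC and reports no p-values along the way. A minimal illustration on simulated data (variable names invented for the sketch):

```r
## step() performs AIC-based selection, not hypothesis testing.
## Simulated data, purely illustrative.
set.seed(1)
d <- data.frame(y = rnorm(50), x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
full <- lm(y ~ x1 + x2 + x3, data = d)
sel  <- step(full, trace = 0)   # backward selection on AIC, silently
formula(sel)                    # whichever terms minimise AIC survive
AIC(sel) <= AIC(full)           # step() never accepts a worse AIC
```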
> number of comparisons may be very large, vastly more
> than the half-dozen post-hoc comparisons that are
> common in an experimental design context.
> There is a disconnect here. The multiple testing issue is
> noted in pretty much every discussion of analysis of
> experimental data, but not commonly mentioned (at least
> in older texts) in discussions of stepwise regression, best
> subsets and related regression approaches. One reason
> for this silence may be that there is no ready HSD-like fix.
> The SEs and t-statistics that lm() gives for the finally
> selected model can be grossly optimistic. Running the
> analysis with the same model matrix, but with y-values
> that are noise, can give a useful wake-up call.
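The wake-up call John describes takes only a few lines of R (simulated data, purely illustrative): fit noise against many predictors, let step() select, and look at the t-statistics of the survivors.

```r
## y is pure noise, unrelated to every predictor, yet the model that
## step() selects can still show respectable-looking p-values.
set.seed(42)
n <- 100; p <- 20
X <- matrix(rnorm(n * p), n, p)
d <- data.frame(y = rnorm(n), X)            # y independent of all columns
sel <- step(lm(y ~ ., data = d), trace = 0) # backward AIC selection
summary(sel)$coefficients                   # SEs/t-values are optimistic
```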
Predictions from any single model will also have 'optimistic' standard
errors. The major problem is attempting to select a single model, and
there is also a problem with assuming the model to be true (which
Huber-White so-called 'sandwich' estimators try to avoid, and which robust
fitting avoids more comprehensively). If you really want to assess
uncertainty you need to take into account that the models are false, and
that several models may capture different aspects of the data and so be
false in different ways.
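The Huber-White 'sandwich' idea can be sketched in base R (the HC0 form below is one simple variant; the contributed 'sandwich' package offers fuller implementations):

```r
## HC0 sandwich standard errors versus the usual model-based ones,
## computed by hand in base R as an illustration.
fit <- lm(dist ~ speed, data = cars)
X <- model.matrix(fit)
u <- residuals(fit)
bread  <- solve(crossprod(X))        # (X'X)^{-1}
meat   <- crossprod(X * u)           # X' diag(u^2) X
vc_hc0 <- bread %*% meat %*% bread   # the "sandwich"
cbind(model_se = sqrt(diag(vcov(fit))),
      hc0_se   = sqrt(diag(vc_hc0)))
```

The sandwich variance stays consistent under heteroscedasticity, which is one way of not fully trusting the assumed model.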
> John Maindonald email: john.maindonald at anu.edu.au
> phone : +61 2 (6125)3473 fax : +61 2(6125)5549
> Centre for Mathematics & Its Applications, Room 1194,
> John Dedman Mathematical Sciences Building (Building 27)
> Australian National University, Canberra ACT 0200.
> On 16 Jul 2007, at 8:00 PM, Simon Blomberg wrote:
>> If you have a priori planned comparisons, you can just test those
>> linear contrasts, with no need to correct for multiple testing. If you
>> do not, and you are relying on looking at the data and analysis to tell
>> you which treatment means to compare, and you are considering several
>> tests, then you should consider correcting for multiple testing. There
>> is a large literature on the properties of the various tests. (Tukey
>> HSD usually works pretty well for me.)
>> <rant> Why do people design experiments with a priori hypotheses in
>> mind, yet test them using post hoc comparison procedures? It's as if
>> they are afraid to admit that they had hypotheses to begin with! Far
>> better to test what you had planned to test using the more powerful
>> methods for planned comparisons, and leave it at that.
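Simon's point about planned comparisons can be sketched in base R on simulated data (the clone labels, means, and contrast are invented for illustration):

```r
## A single a priori contrast tested directly, rather than an
## all-pairs post hoc procedure. Simulated data.
set.seed(7)
d <- data.frame(clone  = gl(3, 10, labels = c("A", "B", "C")),
                height = rnorm(30, mean = rep(c(10, 12, 10), each = 10)))
## planned hypothesis: clone B differs from the average of A and C
contrasts(d$clone) <- cbind(BvsAC = c(-1, 2, -1), AvsC = c(1, 0, -1))
fit <- lm(height ~ clone, data = d)
summary(fit)$coefficients   # row 'cloneBvsAC' tests the planned contrast
```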
>> On Mon, 2007-07-16 at 09:52 +0200, Adrian J. Montero Calvo wrote:
>>> I'm designing an experiment in order to compare the growth of
>>> several clones of a tree species. It will be a completely randomized
>>> design. How can I decide which method of mean comparison to choose?
>>> HSD, TukeyHSD, Duncan, ...? Thanks in advance
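For a completely randomised design like the one described, base R's TukeyHSD() on an aov fit is a common starting point. A minimal sketch on simulated data (clone labels and means invented for illustration):

```r
## One-way CRD: overall F test, then Tukey HSD pairwise comparisons
## with a family-wise error-rate adjustment. Simulated data.
set.seed(1)
growth <- data.frame(clone  = gl(4, 8, labels = paste0("c", 1:4)),
                     height = rnorm(32, mean = rep(c(5, 5, 6, 7), each = 8)))
fit <- aov(height ~ clone, data = growth)
summary(fit)      # overall F test for any clone differences
TukeyHSD(fit)     # all 6 pairwise comparisons, adjusted
```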
>>> R-help at stat.math.ethz.ch mailing list
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> Simon Blomberg, BSc (Hons), PhD, MAppStat.
>> Lecturer and Consultant Statistician
>> Faculty of Biological and Chemical Sciences
>> The University of Queensland
>> St. Lucia Queensland 4072
>> Room 320 Goddard Building (8)
>> T: +61 7 3365 2506
>> email: S.Blomberg1_at_uq.edu.au
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595