[R] Question

Tue Dec 14 19:03:58 CET 2010

What about thoughtless un-nasty comments?

I would suggest the original poster (and others thinking that they want to do normality testing) read the help page for SnowsPenultimateNormalityTest in the TeachingDemos package, which I think agrees mostly with what Bert wrote below (though it goes more with the idea that data "always" deviates from normality, but that may be differences in interpreting the language rather than the theory).
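
To make the sample-size point concrete, here is a rough simulation sketch in base R. The t-distribution with 10 degrees of freedom, the seed, the sample sizes, and the helper name reject_rate are all assumptions chosen for illustration; none of them come from the thread. The population is only mildly non-normal, yet the share of samples in which shapiro.test() rejects normality should climb as n grows.

```r
# Illustrative sketch only: the t(10) population, seed, and sample sizes are assumptions.
set.seed(42)

# Fraction of simulated samples for which Shapiro-Wilk rejects normality at level alpha.
# (hypothetical helper, not part of any package)
reject_rate <- function(n, reps = 500, alpha = 0.05) {
  mean(replicate(reps, shapiro.test(rt(n, df = 10))$p.value < alpha))
}

sizes <- c(50, 1000, 5000)   # shapiro.test() accepts at most 5000 observations
rates <- sapply(sizes, reject_rate)
names(rates) <- paste0("n=", sizes)
print(round(rates, 2))
```

Whether a rejection at n = 5000 matters for the analysis at hand is a separate question, and the test cannot answer it.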

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Bert Gunter
> Sent: Tuesday, December 14, 2010 10:12 AM
> To: Gerrit Eichner
> Cc: r-help at r-project.org; Matthew Rosett
> Subject: Re: [R] Question
> 
> ... (in addition to the very useful suggestion to plot your data):
> 
> (Sounds like a homework question... ?).
> 
> Sigh..... [mount soapbox]
> 
> 1. "Data" never deviate from normality. They only provide
> evidence to challenge ("test" is the formal term) the assumption that
> the population from which the data were sampled (how? -- see below)
> can be modeled as normal (e.g. whether the data provide strong
> evidence against this assumption). This is a philosophical brain
> twister, I know; but understanding what it means is actually very
> important for how one uses evidence (data) to inform science. It took
> me about 20 years after grad school to (partially, anyway) figure it
> out. Bear of little brain and all that..
> 
> 2. Define: "Deviate from normality." With a sample of 1000, normality
> tests at conventional significance levels will typically come out
> statistically significant/contradict normality (which is why a whole
> school of statistics, the gang of Bayesians, do not think that
> "statistical significance" and "evidence in the data" have much to do
> with one another). But that's not the real question, is it?
> 
> 3. The real question is: Does whatever I do to analyze the data and
> draw scientific conclusions depend crucially on the assumption of
> normality of the underlying population from which the data are
> sampled? Of course, it depends on exactly what you do, but, by and
> large, basic statistical texts continue to teach that the answer is
> yes. Unfortunately, that is mostly (not always -- and it depends on
> what's at issue) a lie, as we have known for about 50 years. The
> crucial matter in practice is not normality but how the sampled data
> were obtained: the study design and, especially, the issue of
> "independence." Unfortunately, that is rather complicated to deal
> with, so the Intro Stats texts prefer to ignore it and teach hogwash.
> 
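
A small simulation may help illustrate point 3. This is a sketch under stated assumptions, not anything from the original post: the one-sample t-test, n = 50, the centered exponential population, and the AR(1) coefficient of 0.5 are all illustrative choices, and type1, skewed_iid, and normal_ar1 are hypothetical helper names. It compares how often a nominal 5% t-test falsely rejects a true null when the data are independent but skewed, versus normal but serially dependent.

```r
# Illustrative sketch only: n, reps, the populations, and the AR coefficient are assumptions.
set.seed(1)
n <- 50
reps <- 2000
alpha <- 0.05

# Empirical type I error of a one-sample t-test of mu = 0,
# given a generator function whose true mean is 0.
type1 <- function(gen) {
  mean(replicate(reps, t.test(gen(), mu = 0)$p.value < alpha))
}

# Independent but clearly non-normal: centered exponential (true mean 0)
skewed_iid <- function() rexp(n) - 1

# Normal innovations but dependent observations: AR(1) with coefficient 0.5 (true mean 0)
normal_ar1 <- function() as.numeric(arima.sim(list(ar = 0.5), n = n))

res <- c(skewed_iid = type1(skewed_iid), normal_ar1 = type1(normal_ar1))
print(round(res, 3))
```

Under these assumptions the dependent case should typically reject far more often than the nominal 5%, while the skewed but independent case should stay much closer to it.
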
> [dismount soapbox]
> 
> Thoughtful nasty rejoinders welcome. Please send your thought-less
> nasty ones to me privately to spare our colleagues.
> 
> Cheers,
> Bert
> 
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics