[R] sciplot question

Frank E Harrell Jr f.harrell at vanderbilt.edu
Sun May 24 17:39:27 CEST 2009

spencerg wrote:
> Dear Frank, et al.:
> Frank E Harrell Jr wrote:
>> <snip>
>> Yes; I do see a normal distribution about once every 10 years.
>      To what do you attribute the nonnormality you see in most cases? 
>           (1) Unmodeled components of variance that can generate errors 
> in interpretation if ignored, even with bootstrapping?
>           (2) Honest outliers that do not relate to the phenomena of 
> interest and would better be removed through improved checks on data 
> quality, but where bootstrapping is appropriate (provided the data are 
> not also contaminated with (1))?
>           (3) Situations where the physical application dictates a 
> different distribution such as binomial, lognormal, gamma, etc., 
> possibly also contaminated with (1) and (2)?
>      I've fit mixtures of normals to data before, but one needs to be 
> careful about not carrying that to extremes, as the mixture may be a 
> result of (1) and therefore not replicable.
>      George Box once remarked that he thought most designed experiments 
> included split plotting that had been ignored in the analysis.  That is 
> only a special case of (1).
>      Thanks,
>      Spencer Graves


Those are all important reasons for non-normality of marginal 
distributions.  But the biggest reason of all is that the underlying 
process did not know about the normal distribution.  Normality in raw 
data is usually an accident.


Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
