[R] Checking a large data set for normality

David Winsemius dwinsemius at comcast.net
Thu Sep 26 01:08:40 CEST 2013


On Sep 25, 2013, at 5:55 AM, steric wrote:

> It was just a question to see if it was possible on a large data set. I
> wasn't looking for a flame war. New strategy is simply new strategy. Half of
> my time spent on R is trying to find a better way.
>
If you say why you want to do a global test with such a function and  
what function you had in mind for a single vector, then perhaps:

lapply( dfrm[sapply(dfrm, is.numeric)], your_preferred_normality_check_fun)
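
As a rough sketch of how that pattern might be filled in: the example data and the helper qq_cor() below are hypothetical stand-ins, not something from the original question. The helper returns the correlation between the sorted values and the corresponding normal quantiles, which is a descriptive measure rather than a formal test, so it does not depend on the sample size the way a p-value does.

```r
# Hypothetical example data; `dfrm` stands in for the poster's data frame.
set.seed(1)
dfrm <- data.frame(
  a = rnorm(200),                                        # roughly normal
  b = rexp(200),                                         # clearly skewed
  g = factor(sample(c("x", "y"), 200, replace = TRUE))   # non-numeric column
)

# Hypothetical per-vector check: correlation of the points on a normal Q-Q plot.
# Values near 1 suggest the points fall close to a straight line.
qq_cor <- function(x) {
  x <- x[!is.na(x)]                  # drop missing values
  qq <- qqnorm(x, plot.it = FALSE)   # theoretical (x) and sample (y) quantiles
  cor(qq$x, qq$y)
}

# Apply the check to the numeric columns only; the factor column is skipped.
res <- lapply(dfrm[sapply(dfrm, is.numeric)], qq_cor)
print(res)
```

Any other function of a single numeric vector could be dropped in where qq_cor appears.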


If you specify the statistical issues, then there might be a more
specific response. There is a long history of people coming to R-help
asking for tests of normality and being given advice similar to what I
gave. There are also many people who mistakenly believe that you need
normality of predictor variables to do regression.

There is an entire Task View on robust methods. There is also:

install.packages("TeachingDemos")
library(TeachingDemos)
?SnowsPenultimateNormalityTest

You are asked in the Posting Guide to do some searching of the  
archives. I use Markmail for my searches but others use the Newcastle  
search site or Rseek.

-- 
David Winsemius, MD
Alameda, CA, USA


