[R] shapiro.test

Sat Feb 22 08:29:40 CET 2014

Greg,

I really like that TeachingDemos::SnowsPenultimateNormalityTest()… even the tortuous way to always return a p-value == 0:

# the following function works for current implementations of R
# to my knowledge, eventually it may need to be expanded
is.rational <- function(x){
    rep( TRUE, length(x) )
}

tmp.p <- if( any(is.rational(x))) {
     0
} else {
     # current implementation will not get here if length
     # of x is positive.  This part is reserved for the
     # ultimate test
     1
}

(tmp.p is then returned as the p.value.) I also like the nice and sexy way R prints that p-value:

p-value < 2.2e-16

which looks much more serious than 'p-value = 0'… Here you have nothing to do: base::format.pval(), called from stats:::print.htest(), already does the job for you!
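
To see that printing behaviour in isolation, here is a minimal sketch. It builds an "htest" object by hand with a p-value of exactly 0; the object, its statistic and its method string are made up for illustration and have nothing to do with the TeachingDemos code. The cutoff is the `eps` argument of format.pval(), which defaults to .Machine$double.eps.

```r
# A hand-made "htest" object (hypothetical, for illustration only)
fake_test <- structure(
  list(
    statistic = c(X = 1),                        # dummy test statistic
    p.value   = 0,                               # exactly zero
    method    = "Hypothetical test with a zero p-value",
    data.name = "x"
  ),
  class = "htest"
)

# print.htest() formats the p-value through format.pval()
print(fake_test)

# The same formatting step on its own: values below eps are shown as "< eps"
fp  <- format.pval(0, digits = 4)
eps <- .Machine$double.eps
fp
eps
```

Any p-value smaller than eps is reported as "less than eps" rather than as a number, so a literal 0 never shows up in the printed result.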

I am just curious… Are there teachers out there pointing to that test? If yes, what fraction of the students realise what is going on? I guess it is closer to zero than to one, unfortunately. Wait… I need another SnowsPenultimateXxxxTest() here to check the null hypothesis that all my students are doing what they are supposed to do when discovering a new statistical tool!

Best,

Philippe Grosjean

On 21 Feb 2014, at 23:53, Greg Snow <538280 at gmail.com> wrote:

> Rui,
> 
> Note this quote from the last paragraph of the Details section of ?ks.test:
> 
> "If a single-sample test is used, the parameters specified in '...'
>     must be pre-specified and not estimated from the data."
> 
> Which is the exact opposite of your example.
> 
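
A small simulation may make the point from ?ks.test concrete. This is only a sketch under assumed settings (sample size, number of replicates and seed are arbitrary): it draws truly normal samples and compares ks.test() with parameters estimated from each sample against ks.test() with the true, pre-specified parameters.

```r
set.seed(42)
n_rep <- 2000   # number of simulated samples (arbitrary choice)
n     <- 50     # size of each sample (arbitrary choice)

p_estimated <- numeric(n_rep)
p_fixed     <- numeric(n_rep)

for (i in seq_len(n_rep)) {
  x <- rnorm(n)  # the null hypothesis is true: data are standard normal
  # parameters estimated from the same data (what ?ks.test warns against)
  p_estimated[i] <- ks.test(x, "pnorm", mean(x), sd(x))$p.value
  # parameters fixed in advance
  p_fixed[i]     <- ks.test(x, "pnorm", 0, 1)$p.value
}

# Proportion of rejections at the 5% level under each approach
rate_estimated <- mean(p_estimated < 0.05)
rate_fixed     <- mean(p_fixed < 0.05)
c(estimated = rate_estimated, fixed = rate_fixed)
```

With pre-specified parameters the rejection rate should sit near the nominal 5%. With estimated parameters it should fall well below that, because the fitted normal is pulled towards the data, so the reported p-values are too large.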
> 
> 
> Gonzalo,
> 
> Why are you testing your data for normality?  For large sample sizes
> the normality tests often give a meaningful answer to a meaningless
> question (for small samples they give a meaningless answer to a
> meaningful question).
> 
> If you really feel the need for a p-value then
> SnowsPenultimateNormalityTest in the TeachingDemos package will work
> for large sample sizes.  But note that the documentation for that
> function is considered more useful than the function itself.
> 
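
The large-sample versus small-sample remark can also be illustrated with a rough simulation. The distributions, sample sizes and number of replicates below are assumptions picked for illustration, not anything taken from the data discussed in this thread.

```r
set.seed(123)
n_rep <- 200  # replicates per scenario (arbitrary choice)

# Scenario 1: large sample, mild departure from normality
# (t distribution with 10 degrees of freedom, n = 5000)
reject_large <- replicate(n_rep, shapiro.test(rt(5000, df = 10))$p.value < 0.05)

# Scenario 2: small sample, clear departure from normality
# (uniform distribution, n = 10)
reject_small <- replicate(n_rep, shapiro.test(runif(10))$p.value < 0.05)

# Proportion of rejections at the 5% level in each scenario
rate_large <- mean(reject_large)
rate_small <- mean(reject_small)
c(large_mild = rate_large, small_clear = rate_small)
```

The large samples should be rejected far more often than the small ones, even though the uniform data are much further from normal than the t data: the rejection rate tracks the sample size more than the size of the departure.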
> 
> 
> On Fri, Feb 21, 2014 at 3:04 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>> Hello,
>> 
>> Not answering your question directly, but if the sample size is a documented
>> limitation of shapiro.test and you want a normality test, why don't you use
>> ?ks.test?
>> 
>> m <- mean(HP_TrinityK25$V2)
>> s <- sd(HP_TrinityK25$V2)
>> 
>> ks.test(HP_TrinityK25$V2, "pnorm", m, s)
>> 
>> 
>> Hope this helps,
>> 
>> Rui Barradas
>> 
>> Em 21-02-2014 15:59, Gonzalo Villarino Pizarro escreveu:
>> 
>>> Dear R users,
>>> Please help with this maybe basic question. I am trying to see if my
>>> data are normal, but it is a large file and the test does not work.
>>> I keep getting the message: "Error in shapiro.test(x = HP_TrinityK25$V2)
>>> : sample size must be between 3 and 5000"
>>> Thanks!
>>> 
>>>  shapiro.test(x=HP_TrinityK25$V2)
>>> Error in shapiro.test(x = HP_TrinityK25$V2) :
>>>   sample size must be between 3 and 5000
>>> 
>>> ##Note:
>>> HP_TrinityK25= my file
>>> HP_TrinityK25$V2= data in my file
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>> 
> 
> 
> 
> -- 
> Gregory (Greg) L. Snow Ph.D.
> 538280 at gmail.com
> 