[R] Is it ok to apply the z.test this way?

Greg Snow Greg.Snow at imail.org
Fri Apr 16 21:07:00 CEST 2010


Several points:

1. The Shapiro test does not tell you that something is normal or highly normal, only that you don't have enough evidence to disprove that the data came from a normal population (powered for a certain type of deviation from normality).

2. The z.test function is intended as a stepping stone for students: a simple test with unrealistic assumptions to convey the ideas, after which you relax the assumptions and learn about t tests and others.

3.  The z test is only appropriate when the population standard deviation is known. You calculate the sd from the data, and that is what t tests are for.

4.  Calculating the hypothesized mean from the data is backwards.

5.  Using a sample size of 1 is questionable; doing this 1,000 times without correction for multiple testing is even more questionable.

6.  Your code is equivalent to:

tmp <- seq(0,1, by=0.001)
tmp2 <- tmp[ abs(tmp-mean(Distribution))/sd(Distribution) > 1.96 ]

just slower and less memory efficient.

7. None of this establishes whether a given value comes from an unknown distribution.
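
To make point 6 concrete, here is a minimal sketch that compares the loop with the vectorized version. Several things in it are assumptions of mine rather than anything from your post: `Distribution` is simulated with rbeta() as a stand-in for your data, and the helper `z_p_value` is a hypothetical hand-written two-sided p-value for a single observation, used instead of z.test so the sketch runs without TeachingDemos.

```r
# Hypothetical stand-in for the poster's data: 500 values in [0, 1].
set.seed(42)
Distribution <- rbeta(500, 5, 5)

m <- mean(Distribution)
s <- sd(Distribution)

# Two-sided z-test p-value for one observation x (sample size 1),
# computed by hand: z = (x - mu) / stdev, p = 2 * P(Z > |z|).
# (Illustrative helper, not part of any package.)
z_p_value <- function(x, mu, stdev) 2 * pnorm(-abs((x - mu) / stdev))

# Loop version, mirroring the structure of the original code.
grid <- seq(0, 1, by = 0.001)
SelectedVals <- c()
for (i in grid) {
  if (z_p_value(i, m, s) <= 0.05) SelectedVals <- c(SelectedVals, i)
}

# Vectorized version; qnorm(0.975) is the cutoff that 1.96 approximates.
tmp2 <- grid[abs(grid - m) / s >= qnorm(0.975)]

# Compare the two sets of selected grid points.
isTRUE(all.equal(SelectedVals, tmp2))
```

Both versions only mark the grid points farther than about 1.96 standard deviations from the sample mean, which is why the loop adds nothing beyond the one-line subset.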

If you can tell us what your real question is, then maybe we can help with a real solution.

So to answer your question of whether it is ok to use z.test in that way: Legally, the license says you can use it any way you want; ethically, morally, aesthetically, or following the intent of the author, No!

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Atte Tenkanen
> Sent: Friday, April 16, 2010 10:11 AM
> To: r-help at r-project.org
> Subject: [R] Is it ok to apply the z.test this way?
> 
> Dear R-users,
> 
> I want to check whether certain values are from a random distribution
> that includes values between 0 and 1. So, it is not really normal even
> though shapiro.test says it is highly normal... Can I do something like
> this and trust that the values given are right? z.test is from the
> package TeachingDemos.
> -----------------------------------------------------------------------
> SelectedVals=c()
> for(i in seq(0,1,by=0.001))
> {
> 	if((z.test(i, mu=mean(Distribution),
> stdev=sd(Distribution))$p.value)<=0.05) SelectedVals=c(SelectedVals,i)
> }
> 
> -----------------------------------------------------------------------
> I have marked the border values given by this script on the histogram
> of the original random distribution:
> 
> http://www.ag.fimug.fi/~Atte/62Hist100410.pdf
> 
> Atte Tenkanen
> University of Turku, Finland
> Department of Musicology
> +35823335278
> http://users.utu.fi/attenka/
> 


