[R] missing ctest and methodological question

Thomas Lumley tlumley at u.washington.edu
Mon Apr 23 17:49:11 CEST 2001


On 23 Apr 2001, Christof Meigen wrote:

>
>
> Hi,
>
> I couldn't figure out how to use the functions from the
> ctest library. I'm using the r-base package that comes with
> debian potato. library("ctest") told me that no such package
> existed. I checked CRAN, but no such package was
> available; instead I was told that it would be part of the
> standard installation. But functions from ctest like
> shapiro-wilk don't work.  The only thing I found was an
> ASCII file "ctest" which I could load, but it was only a wrapper
> for some C library which seems not to be installed.

It's been a long time so I'm not sure, but it may be that the version of R
in 'potato' precedes the addition of ctest to the base package.  In any
case you want to upgrade, and I haven't heard of library("ctest") not
working on any recent systems (in fact we have people complaining that
they can't get rid of it...)
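
A quick way to see what you have is sketched below. It assumes only that
R starts at all; it prints the version and attaches ctest only if
shapiro.test() is not already visible. The message text is my own wording,
not something R prints.

```r
# Which R is this?
cat(R.version.string, "\n")

# Only try to attach 'ctest' if shapiro.test() is not already visible.
if (!exists("shapiro.test", mode = "function")) {
  ok <- suppressWarnings(require("ctest", character.only = TRUE))
  if (!ok) message("shapiro.test() not found; upgrading R is the likely fix")
}
```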

> Besides that technical problem I would also appreciate your
> scientific advice. For a publication I have to check about
> 80 samples with about 3000-5000 values each whether they
> are normally distributed (N(0,1) to be exact). The problem is
> that they are derived from discrete measurements (the
> weight of children, for example, where nearly all values
> are full or half kilograms), so Kolmogorov-Smirnov doesn't
> seem to be the right choice. Shapiro-Wilk, however,
> is limited to 5000 values, for good reasons, I think.

Well, if they are discrete then they *aren't* normally distributed. Even
the Shapiro-Wilk test will reject for discretised Normal data, though
slightly less powerfully than the Kolmogorov-Smirnov.
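
If you want to see this for yourself, a small simulation along the
following lines may help. It is only a sketch: the mean of 20, the sd of 3
and the half-kilogram step are made-up numbers, not taken from your data,
and the p-values will vary with the seed.

```r
set.seed(42)
n    <- 5000                          # shapiro.test() accepts 3 to 5000 values
y    <- rnorm(n, mean = 20, sd = 3)   # made-up continuous "weights" in kg
step <- 0.5                           # rounding to half kilograms
x    <- round(y / step) * step        # discretised version of the same data

# Shapiro-Wilk on the continuous and on the rounded values
p_sw_raw     <- shapiro.test(y)$p.value
p_sw_rounded <- shapiro.test(x)$p.value

# Kolmogorov-Smirnov against a Normal with estimated parameters;
# the rounded data contain ties, so ks.test() may warn here.
p_ks_raw     <- ks.test(y, "pnorm", mean = mean(y), sd = sd(y))$p.value
p_ks_rounded <- suppressWarnings(
  ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value
)

print(c(sw_raw = p_sw_raw, sw_rounded = p_sw_rounded,
        ks_raw = p_ks_raw, ks_rounded = p_ks_rounded))
```

Making the step coarser relative to the sd should make the effect of the
rounding more visible.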

You could try the skewness and kurtosis based tests of normality, or you
could write a function that computes the CDF of a suitably discretised
Normal distribution and compare your data to that using ks.test(), which
allows you to specify any CDF.  This is slightly cheating as you will
presumably be using estimated parameters in computing this CDF, but with
3000 cases it should all come out in the wash.
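
A minimal sketch of the discretised-CDF idea follows. Everything named
here (ks_disc, the step of 0.5, the simulated "weights") is hypothetical
and chosen for illustration. Because the way ks.test() treats ties may
depend on the R version, the sketch computes the largest gap between the
empirical CDF and the discretised Normal CDF by hand, and gets a p-value
by simulation, re-estimating the parameters in each simulated sample so
that the "slightly cheating" part is accounted for.

```r
# Hypothetical helper: largest gap between the empirical CDF of x and the
# CDF of a Normal(mean, sd) variable rounded to the nearest multiple of step.
ks_disc <- function(x, mean, sd, step = 0.5) {
  k    <- round(x / step)               # observations in grid units
  grid <- seq(min(k) - 1, max(k))       # grid points where the gap can peak
  Fn   <- ecdf(k)(grid)                 # empirical CDF on the grid
  # A rounded value is <= g*step exactly when the underlying Normal value
  # lies below the upper edge of that rounding bin, (g + 0.5) * step.
  F0   <- pnorm((grid + 0.5) * step, mean = mean, sd = sd)
  max(abs(Fn - F0))
}

set.seed(1)
step <- 0.5
x <- round(rnorm(3000, mean = 20, sd = 3) / step) * step  # made-up data

d_obs <- ks_disc(x, mean(x), sd(x), step)

# Reference distribution: simulate rounded Normal samples from the fitted
# parameters and recompute the statistic with re-estimated parameters.
d_sim <- replicate(200, {
  y <- round(rnorm(length(x), mean(x), sd(x)) / step) * step
  ks_disc(y, mean(y), sd(y), step)
})
p_value <- mean(d_sim >= d_obs)

print(c(statistic = d_obs, p.value = p_value))
```

The 200 replicates are just enough for a rough p-value; more would be
needed for anything reported in a publication.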

This is all based on the assumption that you do in fact want a normality
test, an assumption I am usually suspicious of...


	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
