[R] Specification of factors in tapply
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Feb 21 17:13:54 CET 2001
On Wed, 21 Feb 2001 rijn at swi.psy.uva.nl wrote:
>
> After some fiddling around with the tapply command, I discovered that the
> factors (the INDEX argument) given to tapply must be specified in
> fastest-cycling first order.
Wait a minute: the factors can be supplied in any order. If you ignore
the structure of the result you will get confused, though.
There are two cases:
1) `FUN' returns a single atomic value. The result is a simple array, as
described on the help page.
2) `FUN' does not return a single atomic value. The result is a list with a
dim attribute, that is, an array each of whose elements is a vector (and a
list is a vector).
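The two cases can be checked with a small sketch. The vectors `v`, `g` and `h` are made up for illustration and are not from the original post; the behaviour shown is that of `tapply` with its default `simplify = TRUE`.

```r
# Illustrative data (hypothetical): 8 values in a 2 x 2 layout, 2 per cell
v <- 1:8
g <- rep(1:2, each = 4)
h <- rep(1:2, times = 4)

# Case 1: FUN returns a single atomic value per cell -> ordinary array
a1 <- tapply(v, list(g, h), sum)
is.list(a1)    # FALSE
dim(a1)        # 2 2

# Case 2: FUN returns a vector of length 2 per cell -> list with a dim attribute
a2 <- tapply(v, list(g, h), range)
is.list(a2)    # TRUE
dim(a2)        # 2 2
a2[[1, 1]]     # the vector stored in cell (1, 1)
```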
I think you are just discovering that if you collapse an array to a
vector, you get the results in Fortran order, i.e. column-major, with the
first subscript varying fastest.
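A minimal sketch of what Fortran order means for a matrix; the matrix `m` is a made-up example whose entries encode their own row and column.

```r
# Entry "rc" sits in row r, column c
m <- rbind(c(11, 12, 13),
           c(21, 22, 23))

# Collapsing to a vector walks down each column in turn,
# so the row (first) subscript varies fastest
as.vector(m)   # 11 21 12 22 13 23
```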
>
> The following code shows how I discovered my error: (R version 1.2.1)
>
> -o-o-o-o-o-
>
> x <- as.data.frame(list(data=c(-9,0,3,1,-9,1,0,-9,0,3,1,-9,1,0),
> subj=c(rep(1,7),rep(2,7)),
> cond=rep(c(rep(1,4),rep(2,3)),2)))
>
> x$first <- unlist(tapply(x$data,list(x$subj,x$cond),
> function(x) {
> retval<-rep(F,length(x));
> if (length(x[x>=0])>0) {
> retval[min(which(x>=0))]<-T;
> }
> print(cbind(x,retval)); # Print some debug info
> retval}))
The order is only relevant because you unlisted an array. Nobody
said that you could add the results to x after unlisting: that the
unlisted values line up with the rows of x is your assumption.
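One way to get the per-cell results back in the original row order is to avoid unlisting the array and use `split` and `unsplit` instead. This is only a sketch, not something suggested in the original exchange: the names `d`, `first_nonneg` and `groups` are made up, the data are those of the quoted example, and it assumes a version of R that provides `unsplit`.

```r
# Data from the quoted example, rebuilt under a different name
d <- data.frame(
  data = c(-9, 0, 3, 1, -9, 1, 0, -9, 0, 3, 1, -9, 1, 0),
  subj = rep(1:2, each = 7),
  cond = rep(c(1, 1, 1, 1, 2, 2, 2), times = 2)
)

# Flag the first non-negative value within a group
first_nonneg <- function(v) {
  out <- rep(FALSE, length(v))
  ok <- which(v >= 0)
  if (length(ok) > 0) out[ok[1]] <- TRUE
  out
}

groups <- list(d$subj, d$cond)

# unlist() of the tapply result follows the array's cell order, not the row order
bad <- unlist(tapply(d$data, groups, first_nonneg))

# unsplit() puts each group's values back at the rows they came from
d$first <- unsplit(lapply(split(d$data, groups), first_nonneg), groups)

which(bad)      # positions flagged when the cell order is mistaken for row order
which(d$first)  # positions flagged in the original row order
```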
> -o-o-o-o-
>
> resulting in:
>
> > x
> data subj cond first
> 1 -9 1 1 FALSE
> 2 0 1 1 TRUE
> 3 3 1 1 FALSE
> 4 1 1 1 FALSE
> 5 -9 1 2 FALSE
> 6 1 1 2 TRUE
> 7 0 1 2 FALSE
> 8 -9 2 1 FALSE
> 9 0 2 1 FALSE # <--
> 10 3 2 1 TRUE # <--
> 11 1 2 1 FALSE
> 12 -9 2 2 FALSE
> 13 1 2 2 TRUE
> 14 0 2 2 FALSE
>
> I could not find any reference to this order in the tapply help file nor
> in "An Introduction to R" (Version 1.2.1 (2001-01-15), PDF file p17), it
> might prove useful to include some information about this.
The ordering seems to me to have nothing to do with tapply: that returns an
array with dimnames referring to the cells used.
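As a sketch of using those dimnames, a cell can be looked up by its factor levels rather than by position. The data are those of the quoted example; the names `d` and `res` and the counting function are assumptions made for illustration.

```r
d <- data.frame(
  data = c(-9, 0, 3, 1, -9, 1, 0, -9, 0, 3, 1, -9, 1, 0),
  subj = rep(1:2, each = 7),
  cond = rep(c(1, 1, 1, 1, 2, 2, 2), times = 2)
)

# Count the non-negative values in each subj x cond cell
res <- tapply(d$data, list(subj = d$subj, cond = d$cond),
              function(v) sum(v >= 0))

dimnames(res)    # the levels of subj and cond label the cells
res["2", "1"]    # cell for subj 2, cond 1, addressed by name
```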
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595