[R] Re: [S] tapply for matrices
Frank E Harrell Jr
fharrell at virginia.edu
Thu Oct 10 21:50:57 CEST 2002
Tony Plate provided what seems to be a very fast and elegant solution - see below. I have modified his solution slightly:
mapply <- function(X, INDEX, FUN=NULL, ..., simplify=TRUE) {
  ## Matrix tapply
  ## X: matrix with n rows; INDEX: vector or list of vectors of length n
  ## FUN: function to operate on submatrices of X defined by INDEX
  ## ...: arguments to FUN; simplify: see sapply
  ## Modification of code by Tony Plate <tplate at blackmesacapital.com> 10Oct02
  idx.list <- tapply(seq(nrow(X)), INDEX, c)
  sapply(idx.list, function(idx,x,fun,...) fun(x[idx,,drop=FALSE],...),
         x=X, fun=FUN, ..., simplify=simplify)
}
Example: mapply(x, groups, quantile, probs=c(.25,.5)) returns a matrix with one column per level of groups, holding the first and second quartiles of each submatrix of x (quantile pools all entries of the submatrix).
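As a rough sketch of that example on made-up data: the matrix x, the vector groups, and the name matrixTapply are illustrative assumptions and not part of the original code. The different name is used only because current versions of base R provide their own function called mapply, which a definition under that name would mask.

```r
## Copy of the function above under a hypothetical name (matrixTapply),
## so that base R's own mapply() is not masked.
matrixTapply <- function(X, INDEX, FUN = NULL, ..., simplify = TRUE) {
  ## Row indices of X belonging to each level of INDEX
  idx.list <- tapply(seq(nrow(X)), INDEX, c)
  ## Apply FUN to each submatrix; drop = FALSE keeps one-row groups as matrices
  sapply(idx.list, function(idx, x, fun, ...) fun(x[idx, , drop = FALSE], ...),
         x = X, fun = FUN, ..., simplify = simplify)
}

## Made-up data: a 6 x 2 matrix whose rows fall into two groups
x <- matrix(1:12, ncol = 2)
groups <- c("a", "a", "a", "b", "b", "b")

## One column per group, one row per requested quantile
res <- matrixTapply(x, groups, quantile, probs = c(.25, .5))
print(res)
```

With simplify=TRUE the per-group results are bound into a matrix only when every group returns a vector of the same length; otherwise sapply falls back to returning a list.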
My current uses for this are certain within-subject bivariate summaries when subjects have multiple rows of data.
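One possible within-subject bivariate summary of this kind is a per-subject correlation between two columns. The sketch below uses the index-list idiom directly; the data, the variable names (subject, xy, within.cor), and the choice of cor as the summary are assumptions made purely for illustration.

```r
## Made-up data: two subjects with three rows each, two measurements per row
subject <- c(1, 1, 1, 2, 2, 2)
xy <- cbind(x = c(1, 2, 3, 1, 2, 3),
            y = c(2, 4, 6, 3, 2, 1))

## Row indices for each subject
idx.list <- tapply(seq(nrow(xy)), subject, c)

## Correlation of column 1 with column 2 within each subject's rows
within.cor <- sapply(idx.list,
                     function(idx, m) cor(m[idx, 1], m[idx, 2]),
                     m = xy)
print(within.cor)
```

Here y rises exactly linearly with x for the first subject and falls exactly linearly for the second, so the two within-subject correlations are 1 and -1.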
Thanks Tony,
Frank
P.S. Dave Krantz <dhk at paradox.psych.columbia.edu> reported that he wrote a function mtapply that uses for loops for this but that pays a lot of attention to formatting the output as an array with sensible dimnames.
On Thu, 10 Oct 2002 12:51:54 -0600
Tony Plate <tplate at blackmesacapital.com> wrote:
> I use the following idiom for this:
>
> idx.list <- tapply(seq(numRows(x)), x[,grouping.variable], c)
> lapply(idx.list, function(idx, x) {
> submatrix <- x[idx,,drop=F]
> ... operate on submatrix ...
> }, x)
>
> which seems pretty fast. I sometimes sort x beforehand so that rows with
> the same value of the grouping variable are adjacent.
>
> Hope this helps,
>
> Tony Plate
>
> PS. Please excuse me if the above code has any typos -- it's from memory.
>
> At 02:31 PM 10/10/2002 -0400, you wrote:
> >Does anyone have something like tapply that is extremely fast for matrices
> >when there is a very large number of levels of the grouping variable?
> >I'm referring to, for example,
> >
> >tapply(x, grouping.variable, function.operating.on.submatrix)
> >
> >where x is a matrix and the submatrix is a subset of the rows of x. The
> >grouping variable's length equals the number of rows of x.
> >--
--
Frank E Harrell Jr Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine http://hesweb1.med.virginia.edu/biostat