[R] my ugly apply/sweep code needs help

Gabor Grothendieck ggrothendieck at gmail.com
Sat May 19 06:22:24 CEST 2007


Please include test data in your posts.  Below, sweep.med
centers an entire matrix by subtracting each column's median.
We then lapply f over group.sel, where f(g) takes the submatrix
of data.mat whose rows have group.vec equal to g, applies
sweep.med to it, and prepends a column holding g.  Finally,
do.call(rbind, ...) stacks the per-group results.

sweep.median2 <- function(data.mat, group.vec, group.sel) {
   sweep.med <- function(x) sweep(x, 2, apply(x, 2, median))
   f <- function(g) cbind(g+0,
      sweep.med(data.mat[group.vec == g, , drop = FALSE]))
   do.call(rbind, lapply(group.sel, f))
}
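
To see what the inner helper does on its own, here is a minimal
sketch on a small made-up matrix (x is illustrative only, not from
the original post).  Each column's median is subtracted from that
column, so the centered columns all have median 0.

```r
# Illustrative data only; x is not from the original post.
x <- matrix(c(1, 2, 6,
              10, 20, 60), nrow = 3)

# Same helper as defined inside sweep.median2:
# subtract each column's median from that column
# (the default FUN of sweep is "-").
sweep.med <- function(x) sweep(x, 2, apply(x, 2, median))

centered <- sweep.med(x)
centered

# After centering, every column has median 0.
apply(centered, 2, median)
```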

# test
data.mat <- matrix(1:24, 6)
group.sel <- 1:2
group.vec <- rep(1:3, 2)

sweep.median(data.mat, group.vec, group.sel)
sweep.median2(data.mat, group.vec, group.sel)
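
In the test data every group has two rows, so both functions run.
The drop = FALSE in sweep.median2 matters once a selected group has
only a single row: the quoted sweep.median subsets without it, and
the one-row subset then collapses to a vector that apply() cannot
handle.  A minimal sketch with made-up data (the names one.row and
kept are hypothetical):

```r
# Made-up data: group 1 has a single row, group 2 has two rows.
data.mat <- matrix(1:9, 3)
group.vec <- c(1, 2, 2)

one.row <- data.mat[group.vec == 1, ]                # dimensions dropped
kept    <- data.mat[group.vec == 1, , drop = FALSE]  # still a matrix

# one.row is a plain vector, so apply(one.row, 2, median) would fail.
is.null(dim(one.row))

# kept is still a 1 x 3 matrix, so apply/sweep work as usual.
dim(kept)

# Centering a one-row group gives a row of zeros.
sweep(kept, 2, apply(kept, 2, median))
```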


On 5/18/07, Tyler Smith <tyler.smith at mail.mcgill.ca> wrote:
> Hi,
>
> I have a matrix of data from several groups. I need to center the
> data by group, subtracting the group median from each value, initially
> for two groups at a time. I have a working function to do this, but it
> looks quite inelegant. There must be a more straightforward way to do
> this, but I always get tangled up in apply/sweep/subset
> operations. Any suggestions welcome!
>
> Thanks,
>
> Tyler
>
> My code:
>
> Notes: data.mat is an nxm matrix of data. group.vec is a vector of
> length n with grouping factors. group.sel is a vector of length 2 of
> the groups to include in the analysis.
>
> sweep.median <- function (data.mat, group.vec, group.sel) {
>
>  data.sub1 <- data.mat[group.vec %in% group.sel[1],]
>  data.sub2 <- data.mat[group.vec %in% group.sel[2],]
>
>  data.sub1.med <- apply(data.sub1, MAR=2, median)
>  data.sub1.cent <- sweep(data.sub1, MARGIN=2, data.sub1.med)
>
>  data.sub2.med <- apply(data.sub2, MAR=2, median)
>  data.sub2.cent <- sweep(data.sub2, MARGIN=2, data.sub2.med)
>
>  data.comb <- rbind(data.sub1.cent, data.sub2.cent)
>  data.comb <- cbind(c(rep(group.sel[1],nrow(data.sub1.cent)),
>                       rep(group.sel[2],nrow(data.sub2.cent))),
>                     data.comb)
>
>  return(data.comb)
> }
>


