[R] run function on subsets of matrix

David Winsemius dwinsemius at comcast.net
Sun Mar 27 08:25:16 CEST 2011


On Mar 26, 2011, at 10:26 PM, fisken wrote:

> I was wondering if it is possible to do the following in a smarter  
> way.
>
> I want to get the mean value across the columns of a matrix, but I
> want to do this on subsets of its rows, given by some vector (same
> length as the number of rows). Something like
>
> nObs <- 6
> nDim <- 4
> m <- matrix(rnorm(nObs*nDim), ncol=nDim)
> fac <- sample(1:(nObs/2), nObs, replace=TRUE)
>
> ## loop through the different 'factor' levels
> for (i in unique(fac))
>    print(apply(m[fac==i,],2,mean))

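colMeans() is simpler and faster than apply(..., 2, mean). Note,
though, that the call below averages the rows whose indices are in
unique(fac) in one pass; it does not return one result per level of
fac: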

  colMeans(m[unique(fac),])

#[1]  1.3595197 -0.1374411  0.1062527 -0.3897732
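
If one vector of column means per level of fac is wanted, one possible
sketch (not something from the original post) is to sum the rows within
each level with rowsum() and divide by the group sizes. The seed and the
names sums, counts and group_means are assumptions made only for this
illustration.

```r
set.seed(1)   # arbitrary seed, only so the sketch is reproducible
nObs <- 6
nDim <- 4
m   <- matrix(rnorm(nObs * nDim), ncol = nDim)
fac <- sample(1:(nObs / 2), nObs, replace = TRUE)

# rowsum() adds up the rows of m within each level of fac;
# the result has one row per level, in sorted order of the levels.
sums <- rowsum(m, fac)

# table() counts the rows per level, in the same sorted order.
counts <- as.vector(table(fac))

# Dividing recycles counts down the columns, so row i is divided by
# the size of group i, turning sums into means.
group_means <- sums / counts
print(group_means)
```

This also works when a level occurs only once, because no single-row
subset of m is ever taken.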

>
> Now, the problem is that if a value in 'fac' only occurs once, the
> 'apply' function will complain.

That is because "[" drops dimensions of extent one by default, so a
single-row subset becomes a plain vector and loses its second (column)
margin. Use drop=FALSE to prevent this, and note the extra comma:

print(apply(m[1, , drop=FALSE],2,mean))
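
As a small sketch of the same point applied to the original loop: the
fac vector below is fixed by hand (an assumption for illustration, so
that level 2 is guaranteed to occur exactly once), and res is a
hypothetical name for the collected results.

```r
set.seed(1)   # arbitrary seed, only so the sketch is reproducible
m   <- matrix(rnorm(24), ncol = 4)
fac <- c(1, 1, 2, 3, 3, 3)   # level 2 occurs exactly once

# Without drop=FALSE the single matching row collapses to a vector.
dim(m[fac == 2, ])                 # NULL
# With drop=FALSE it stays a 1 x 4 matrix.
dim(m[fac == 2, , drop = FALSE])   # 1 4

# The loop from the question, with drop=FALSE added, now works for
# every level, including the one that occurs only once.
res <- lapply(sort(unique(fac)), function(i)
  apply(m[fac == i, , drop = FALSE], 2, mean))
print(res)
```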

-- 

David Winsemius, MD
West Hartford, CT


