[R] function on columns of two arrays

Folkes, Michael Michael.Folkes at dfo-mpo.gc.ca
Tue Aug 20 19:57:26 CEST 2013


Here's a short summary of timing results for three ways of running lm on
matching columns of two arrays.
Sadly, looping is the fastest. I don't know whether any of the methods are
sensitive to array dimensions (more columns, fewer layers, etc.).
I welcome corrections, or suggestions for improvements that avoid looping.
I'm going on vacation, so I won't reply until next Wednesday.
Thanks!
Michael

##### begin R script

#shows three ways to do lm on common columns from two arrays.
#sadly looping is fastest by a long shot

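# test data: 1:60 is recycled to fill the 20 x 20 x 2000 array;
# b is an exact linear function of a (slope 3, intercept 10)
a <- array(1:60,dim=c(20,20,2000))
b <- a*3+10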

#method 1
# arun [smartpink111 at yahoo.com]
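method1 <- function(){
  a1<- data.frame(a)
  b1<- data.frame(b)
  # NOTE: the result of this first lapply is discarded, so every model is
  # fit twice; the timing reported for method1 includes that duplicated work.
  lapply(seq_len(ncol(a1)),function(i) lm(b1[,i]~a1[,i]))
  lapply(seq_len(ncol(a1)),function(i) summary(lm(b1[,i]~a1[,i]))$coef)
}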


#method 2
#Jason Law Statistician City of Portland
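method2 <- function(){
  library(abind)
  library(plyr)
  # stack a and b along a new 4th dimension (named ab to avoid masking c())
  ab <- abind(a,b, along = 4)
  # each x is a 20 x 2 matrix: x[,1] comes from a, x[,2] from b
  results <- alply(ab, c(2,3), function(x) lm(x[,2] ~ x[,1]))
  ldply(results, function(x) summary(x)$coef)
}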

#method3 
#looping
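method3 <- function(){
  # two rows (intercept, slope) of summary()$coef per column/layer pair
  results <- matrix(NA,ncol=4,nrow=2*dim(a)[2]*dim(a)[3])
  counter <- 1
  for(layer in seq_len(dim(a)[3])){
    for(col.val in seq_len(dim(a)[2])){
      results[counter:(counter+1),] <-
        summary(lm(b[,col.val,layer]~a[,col.val,layer]))$coef
      counter <- counter+2
    }
  }
  results  # return the filled matrix; a bare for loop returns NULL
}#END method3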
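
A possible variation on the loop, offered as an untimed sketch rather than a tested replacement: lm() does extra work on every call (parsing the formula, building a model frame), and lm.fit() from the stats package skips that by taking the design matrix directly. The name method3_fit and the small test arrays are hypothetical, not part of the thread, and this version returns only the intercept and slope, not the full summary()$coef table.

```r
# Hypothetical variant of method3 (name and test data are illustrative only).
method3_fit <- function(a, b) {
  n_col <- dim(a)[2]
  n_layer <- dim(a)[3]
  results <- matrix(NA_real_, nrow = n_col * n_layer, ncol = 2,
                    dimnames = list(NULL, c("intercept", "slope")))
  row <- 1
  for (layer in seq_len(n_layer)) {
    for (col.val in seq_len(n_col)) {
      # design matrix: a column of ones (intercept) plus the predictor
      X <- cbind(1, a[, col.val, layer])
      results[row, ] <- lm.fit(X, b[, col.val, layer])$coefficients
      row <- row + 1
    }
  }
  results
}

# Small noisy test arrays (assumed data, much smaller than the 20 x 20 x 2000 case).
set.seed(42)
a_small <- array(rnorm(10 * 3 * 4), dim = c(10, 3, 4))
b_small <- a_small * 3 + 10 + rnorm(length(a_small), sd = 0.1)
fit_res <- method3_fit(a_small, b_small)

# Spot-check column 2 of layer 3 against lm(); rows are ordered with
# columns varying fastest within each layer, as in method3.
ref <- coef(lm(b_small[, 2, 3] ~ a_small[, 2, 3]))
all.equal(unname(fit_res[(3 - 1) * 3 + 2, ]), unname(ref))
```

Whether this is actually faster on the full-size arrays would need to be checked with system.time().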

# system.time( method1() )
# system.time( method2() )
# system.time( method3() )
# 
# > system.time( method1() )
#    user  system elapsed 
#  210.52    0.09  212.03 
# 
# > system.time( method2() )
#    user  system elapsed 
#  123.52    0.13  124.07 
# 
# > system.time( method3() )
#    user  system elapsed 
#   79.07    0.01   79.23
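
On the request for a way to avoid looping: for a simple regression with one predictor, the slope and intercept have closed forms (slope = Sxy/Sxx, intercept = mean(y) - slope * mean(x)), so all columns can be handled at once with colSums() and colMeans(). The sketch below assumes only the estimates are needed; it does not reproduce the standard errors, t values or p values from summary()$coef, and it has not been timed against the three methods. The name fast_col_lm and the test arrays are hypothetical.

```r
# Hypothetical helper (not from the thread): per-column regression of b on a
# using the closed-form least-squares estimates, with no call to lm().
fast_col_lm <- function(a, b) {
  stopifnot(identical(dim(a), dim(b)))
  n <- dim(a)[1]
  # Flatten the layers so each matrix column is one (column, layer) pair,
  # with the column index varying fastest within a layer.
  x <- a
  dim(x) <- c(n, length(a) / n)
  y <- b
  dim(y) <- c(n, length(b) / n)
  xc <- sweep(x, 2, colMeans(x))  # centre each column
  yc <- sweep(y, 2, colMeans(y))
  slope <- colSums(xc * yc) / colSums(xc^2)
  intercept <- colMeans(y) - slope * colMeans(x)
  cbind(intercept = intercept, slope = slope)
}

# Small noisy test arrays (assumed data for illustration).
set.seed(1)
a_test <- array(rnorm(10 * 3 * 4), dim = c(10, 3, 4))
b_test <- a_test * 3 + 10 + rnorm(length(a_test), sd = 0.1)
est <- fast_col_lm(a_test, b_test)

# Spot-check column 2 of layer 3 against lm().
idx <- (3 - 1) * dim(a_test)[2] + 2
ref_test <- coef(lm(b_test[, 2, 3] ~ a_test[, 2, 3]))
all.equal(unname(est[idx, ]), unname(ref_test))
```

A column of a that is constant would give Sxx = 0 and a non-finite slope, so that case would need separate handling.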


