[R] function on columns of two arrays

Folkes, Michael Michael.Folkes at dfo-mpo.gc.ca
Tue Aug 20 20:30:03 CEST 2013


That's much more fair; I failed to note that extra line of work.
Good to know looping can be avoided.
Thanks! 

-----Original Message-----
From: arun [mailto:smartpink111 at yahoo.com] 
Sent: August 20, 2013 11:11 AM
To: Folkes, Michael
Cc: R help
Subject: Re: function on columns of two arrays

Hi Michael,

I ran it on my system after deleting one of the lines from method1 (it was not necessary).

method1 <- function(){
  a1<- data.frame(a)
  b1<- data.frame(b)
  lapply(seq_len(ncol(a1)),function(i) summary(lm(b1[,i]~a1[,i]))$coef) }

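# returns the 2 x 4 coefficient table (estimate, std. error, t value, p value)
linRegFun <- function(x,y){
  summary(lm(y~x))$coef }

method4 <- function(){
  a1<- data.frame(a)
  b1<- data.frame(b)
  # mapply walks the columns of a1 and b1 in parallel
  mapply(linRegFun,a1,b1)
}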



system.time(method1())
#   user  system elapsed
# 67.504   0.008  67.636
system.time(method2())
#   user  system elapsed
# 86.952   0.292  87.408
system.time(method3())
#   user  system elapsed
# 50.856   0.000  50.948
system.time(method4())
#   user  system elapsed
# 46.444   0.000  46.526
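
For anyone reading the mapply() version later, here is a small sketch of what it returns. It is only an illustration under assumptions: the array is much smaller than the one in the thread, noise is added so the fits are not perfect, and the names flat and tabs are made up for this example. By default mapply() flattens each 2 x 4 coefficient table into one column of 8 values; SIMPLIFY = FALSE keeps the tables as a list.

```r
set.seed(1)

# Small noisy stand-in for the arrays in the thread (assumed sizes):
# 20 rows, 3 columns, 2 layers
a <- array(rnorm(20 * 3 * 2), dim = c(20, 3, 2))
b <- a * 3 + 10 + array(rnorm(20 * 3 * 2, sd = 0.1), dim = dim(a))

# data.frame() flattens columns and layers: 20 rows x 6 columns
a1 <- data.frame(a)
b1 <- data.frame(b)

linRegFun <- function(x, y) coef(summary(lm(y ~ x)))

# Default: each 2 x 4 table is unrolled into one column of length 8
flat <- mapply(linRegFun, a1, b1)

# SIMPLIFY = FALSE: a list with one 2 x 4 table per column pair
tabs <- mapply(linRegFun, a1, b1, SIMPLIFY = FALSE)

dim(flat)
tabs[[1]]
```

The list form is usually easier to work with afterwards, because the row and column labels of each coefficient table are preserved.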

A.K.






----- Original Message -----
From: "Folkes, Michael" <Michael.Folkes at dfo-mpo.gc.ca>
To: "Law, Jason" <Jason.Law at portlandoregon.gov>; arun <smartpink111 at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Tuesday, August 20, 2013 1:57 PM
Subject: RE: function on columns of two arrays

Here's a short summary of the timings for different methods of running lm on matching columns of two arrays.
Sadly, looping is fastest. I don't know whether any of the methods are sensitive to array dimensions (more columns, fewer layers, etc.).
I invite corrections, or suggestions for avoiding the loop.
Going on vacation, so I won't reply until next Wednesday.
Thanks!
Michael

##### begin R script

# shows three ways to do lm on matching columns from two arrays.
# sadly, looping is fastest by a long shot

a <- array(1:60,dim=c(20,20,2000))
b <- a*3+10

#method 1
# arun [smartpink111 at yahoo.com]
method1 <- function(){
  a1<- data.frame(a)
  b1<- data.frame(b)
  lapply(seq_len(ncol(a1)),function(i) lm(b1[,i]~a1[,i]))
  lapply(seq_len(ncol(a1)),function(i) summary(lm(b1[,i]~a1[,i]))$coef) }


#method 2
#Jason Law Statistician City of Portland
method2 <- function(){
  library(abind)
  library(plyr)
  c <- abind(a,b, along = 4)
  results <- alply(c, c(2,3), function(x) lm(x[,2] ~ x[,1]))
  ldply(results, function(x) summary(x)$coef) }  

#method3
#looping
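method3 <- function(){
  results <- matrix(NA,ncol=4,nrow=2*dim(a)[2]*dim(a)[3])
  counter <- 1
  for(layer in 1:dim(a)[3]){
    for(col.val in 1:dim(a)[2]){
      # each fit contributes a 2 x 4 coefficient table (intercept and slope rows)
      results[counter:(counter+1),] <-
        summary(lm(b[,col.val,layer]~a[,col.val,layer]))$coef
      counter <- counter+2
    }
  }
  results  # return the filled matrix; ending on the for loop would return NULL
}#END method3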

# system.time( method1() )
# system.time( method2() )
# system.time( method3() )
#
# > system.time( method1() )
# user  system elapsed
# 210.52    0.09  212.03
#
# > system.time( method2() )
# user  system elapsed
# 123.52    0.13  124.07
#
# > system.time( method3() )
# user  system elapsed
# 79.07    0.01   79.23 
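
On the request for ways to avoid looping: if only the intercept and slope are needed, one option is to skip lm() entirely and use the closed-form formulas for simple linear regression on whole matrices at once. The sketch below is not one of the methods timed in this thread, and it does not produce the standard errors, t values or p values that summary(lm()) reports. The array sizes, the added noise and the names x, y, xc, yc and k are assumptions made for illustration.

```r
set.seed(42)

# Smaller noisy stand-in for the arrays in the script (assumed sizes)
a <- array(rnorm(20 * 20 * 50), dim = c(20, 20, 50))
b <- a * 3 + 10 + array(rnorm(length(a), sd = 0.5), dim = dim(a))

# Flatten columns and layers: each matrix column is one regression
x <- matrix(a, nrow = dim(a)[1])
y <- matrix(b, nrow = dim(b)[1])

# Centre each column, then slope = sum(xc * yc) / sum(xc^2)
xc <- sweep(x, 2, colMeans(x))
yc <- sweep(y, 2, colMeans(y))
slope <- colSums(xc * yc) / colSums(xc^2)
intercept <- colMeans(y) - slope * colMeans(x)

# Spot check one arbitrary column against lm()
k <- 137
fit <- coef(lm(y[, k] ~ x[, k]))
all.equal(unname(fit), c(intercept[k], slope[k]))
```

Because everything is done with column-wise sums, there is no model object built per column, which is where most of the time goes in the lm()-based methods.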


