[R] apply vs sapply vs loop - lm() call appl(y)ied on array

Wiener, Matthew matthew_wiener at merck.com
Thu Apr 21 14:53:33 CEST 2005


Christoph --

There was just a thread on this earlier this week.  You can search in the
archives for the title:   "refitting lm() with same x, different y".

(Actually, it doesn't turn up in the R site search yet, at least for me.
But if you just go to the archive of recent messages, available through
CRAN, you can search on refitting and find it.  The original post was from
William Valdar, on April 19.)

Hope this helps,

Matt Wiener

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Christoph Lehmann
Sent: Thursday, April 21, 2005 9:24 AM
To: R-help at stat.math.ethz.ch
Subject: [R] apply vs sapply vs loop - lm() call appl(y)ied on array


Dear useRs

(Code of the now mentioned small example is below)

I have 7 * 8 * 9 = 504 series of data (each length 5). For each of 
theses series I want to compute a lm(), where the designmatrx X is the 
same for all these computations.

The 504 series are in an array of dimension d.dim <- c(5, 7, 8, 9)
means, the first dimension holds the data-series.

The lm computation needs performance optimization, since in fact the 
dimensions are much larger. I compared the following approaches:

using a for-loop. using apply, and using sapply. All of these require 
roughly the same time of computation. I was astonished since I expected 
at least sapply to outperfomr the for-loop.

Do you have me another solution, which is faster? many thanks

here is the code
## ------------------------------------------------------
t.length <- 5
d.dim <- c(t.length,7,8,9) # dimesions: time, x, y, z
Y <- array( rep(1:t.length, prod(d.dim)) + rnorm(prod(d.dim), 0, 0.1), 
d.dim)
X <- c(1,3,2,4,5)

## -------- performance tests
## using for loop
date()
z <- rep(0, prod(d.dim[2:4]))
l <- 0
for (i in 1:dim(Y)[4])
  for (j in 1:dim(Y)[3])
   for (k in 1:dim(Y)[2]) {
     l <- l + 1
     z[l] <- unlist(summary(lm(Y[,k, j, i] ~ X)))$r.squared
   }
date()

## using apply
date()
z <- apply(Y, 2:4, function(x) unlist(summary(lm(x ~ X)))$r.squared)
date()

## using sapply
date()
fac <- rep(1:prod(d.dim[2:4]), rep(t.length, prod(d.dim[2:4])))
z <- sapply(split(as.vector(Y), fac), FUN = function(x) 
unlist(summary(lm(x ~ X)))$r.squared)
dim(z) <- d.dim[2:4]
date()

## ------------------------------------------------------

-- 
Christoph Lehmann                            Phone:  ++41 31 930 93 83
Department of Psychiatric Neurophysiology    Mobile: ++41 76 570 28 00
University Hospital of Clinical Psychiatry   Fax:    ++41 31 930 99 61
Waldau                                            lehmann at puk.unibe.ch
CH-3000 Bern 60         http://www.puk.unibe.ch/cl/pn_ni_cv_cl_04.html

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html




More information about the R-help mailing list