[Rd] efficiency and memory use of S4 data objects

Gordon Smyth smyth at wehi.edu.au
Thu Aug 21 20:12:37 MEST 2003


I do lots of analyses on large microarray data sets so memory use and speed 
are both important issues for me. I have been trying to estimate the 
overheads associated with using formal S4 data objects instead of ordinary 
lists for large data objects. In some simple experiments (using R 1.7.1 in 
Windows 2000) with large but simple objects it seems that giving a data 
object a formal class definition and using extractor and assignment 
functions may increase both memory usage and the time taken by simple 
numeric operations by several fold.

Here is a test function which uses a list representation to add 1 to the 
elements of a long numeric vector:

addlist <- function(len,iter) {
    object <- list(x=rnorm(len))
    for (i in 1:iter) object$x <- object$x+1
    object
}

Typical times on my machine are:

 > system.time(a <- addlist(10^6,10))
[1] 2.91 0.00 2.96   NA   NA
 > system.time(addlist(10^7,10))
[1] 28.03  0.44 28.65    NA    NA

Here is a test function doing the same operation with a formal S4 data 
representation:

addS4 <- function(len,iter) {
   object <- new("MyClass",x=rnorm(len))
   for (i in 1:iter) x(object) <- x(object)+1
   object
}

The timing with len=10^6 increases to

 > system.time(a <- addS4(10^6,10))
[1] 6.79 0.06 6.90   NA   NA

With len=10^7 the operation fails altogether due to insufficient memory 
after thrashing around with virtual memory for a very long time.
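
One rough way to put a number on the memory side of this comparison is to reset the garbage collector's high-water mark before each run and read it back afterwards. The sketch below assumes a recent version of R in which gc(reset = TRUE) is available. The helper name peak_vcells is hypothetical, and the class and accessors simply repeat the definitions given at the end of this message. The figures include whatever else is held in the session, so only the difference between the two runs is meaningful.

```r
library(methods)

# Same class and accessors as defined at the end of the message
setClass("MyClass", representation(x = "numeric"))
setGeneric("x", function(object) standardGeneric("x"))
setMethod("x", "MyClass", function(object) object@x)
setGeneric("x<-", function(object, value) standardGeneric("x<-"))
setReplaceMethod("x", "MyClass", function(object, value) {
    object@x <- value
    object
})

addlist <- function(len, iter) {
    object <- list(x = rnorm(len))
    for (i in 1:iter) object$x <- object$x + 1
    object
}

addS4 <- function(len, iter) {
    object <- new("MyClass", x = rnorm(len))
    for (i in 1:iter) x(object) <- x(object) + 1
    object
}

# Hypothetical helper: maximum number of vector cells (8 bytes each)
# recorded by the garbage collector since the statistics were reset
peak_vcells <- function(f) {
    invisible(gc(reset = TRUE))   # reset the "max used" statistics
    f()
    gc()["Vcells", "max used"]
}

len <- 1e5   # kept small so the sketch runs quickly
peak_list <- peak_vcells(function() addlist(len, 10))
peak_S4   <- peak_vcells(function() addS4(len, 10))
cat("list peak Vcells:", peak_list, "\n")
cat("S4 peak Vcells:  ", peak_S4, "\n")
```

The high-water mark is only updated when a garbage collection runs, so this is an approximation rather than an exact peak.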

I guess I'm not surprised by the performance penalty with S4. My question 
is: is the performance penalty likely to be an ongoing feature of using S4 
methods or will it likely go away in future versions of R?

Thanks
Gordon

Here are my S4 definitions:

setClass("MyClass",representation(x="numeric"))
setGeneric("x",function(object) standardGeneric("x"))
setMethod("x","MyClass",function(object) object@x)
setGeneric("x<-", function(object, value) standardGeneric("x<-"))
setReplaceMethod("x","MyClass",function(object,value) {
    object@x <- value
    return(object)
})
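
To help separate the cost of method dispatch through the extractor and assignment generics from the cost of the S4 object itself, one possible check is a variant that reads and writes the slot directly with @. This is only a sketch: addS4slot is a hypothetical name, not part of the experiments above, and no timings are claimed for it.

```r
library(methods)

setClass("MyClass", representation(x = "numeric"))

# Variant of addS4 that accesses the slot directly with @,
# bypassing the x() and x<-() generics
addS4slot <- function(len, iter) {
    object <- new("MyClass", x = rnorm(len))
    for (i in 1:iter) object@x <- object@x + 1
    object
}

# Small length so the sketch runs quickly; increase len to compare with addS4
print(system.time(b <- addS4slot(10^5, 10)))
```

Comparing this timing with addS4 and addlist at the same len would indicate how much of the penalty comes from the generic function calls rather than from the formal class representation.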

 > version
             _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    7.1
year     2003
month    06
day      16
language R


