[R] Why does loading saved/cached objects add significantly to RAM consumption?

Janko Thyson janko.thyson.rstuff at googlemail.com
Tue Aug 30 12:59:36 CEST 2011


Dear list,

I make use of cached objects extensively for time consuming computations 
and yesterday I happened to notice some very strange behavior in that 
respect:
When I execute a given computation whose result I'd like to cache (I 
tried both saving it as '.Rdata' and caching it via package 'R.cache', 
which uses its own file type '.Rcache'), my R session consumes about 
200 MB of RAM, which is fine. Now, when I make use of the previously 
cached object (i.e. loading it and assigning it to a certain field of a 
Reference Class object), RAM consumption of my R process jumps to about 
250 MB!
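For reference, the caching pattern I use looks roughly like this (a 
sketch with 'R.cache'; 'loadCache()' returns NULL on a cache miss, and 
the key list here is made up purely for illustration):

```r
library(R.cache)

## Hypothetical key for illustration; in practice it identifies
## the computation and its inputs.
key <- list("myComputation", n = 5000)

res <- loadCache(key)                # NULL if nothing cached yet
if (is.null(res)) {
    res <- lapply(1:5000, rnorm)     # the expensive computation
    saveCache(res, key = key)        # written as an '.Rcache' file
}
```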
Each new loading of cached/saved objects adds to that consumption (in 
total, about 5-8 objects are processed this way), so at some point I 
easily reach a RAM consumption of over 2 GB, whereas I'm only at about 
200 MB when I compute each object directly! Object sizes (checked with 
'object.size()') remain fairly constant. What's even stranger: after 
loading cached objects and removing them (either via 'rm()' or by 
assigning a 'fresh' empty object to the respective Reference Class 
field), RAM consumption remains at this high level and never comes 
down again.
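For completeness, this is how I watch the numbers from inside R (a 
sketch; 'gc()' reports R's own accounting in MB, which need not match 
what the Task Manager shows for the process):

```r
## Sum the "(Mb)" column of gc()'s report (Ncells + Vcells in use).
mem_used_mb <- function() sum(gc()[, 2])

x <- rnorm(1e6)                      # ~8 MB of doubles
f <- tempfile(fileext = ".Rdata")
save(x, file = f)
rm(x)

before <- mem_used_mb()
load(f)
after <- mem_used_mb()
cat("delta (MB):", round(after - before, 1), "\n")
```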

I also checked the behavior in a small example, a simplification of my 
use case, which you'll find below (checked on both Win XP and Win 7, 
32 bit). I couldn't quite reproduce an immediate increase in RAM 
consumption, but what I still find really strange is
a) why do repeated 'load()' calls result in an increase in RAM consumption?
b) why does the latter not go down again after the objects have been 
removed from '.GlobalEnv'?
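To be explicit about what I mean by "removed" in b) (a minimal sketch; 
as I understand it, after 'rm()' the object is merely unreachable until 
the next garbage collection, and even then the OS-level process size 
need not shrink):

```r
big <- rnorm(2e6)        # ~16 MB of doubles
rm(big)
## Frees the vector inside R at the next collection; the allocator
## may still keep the pages, so the process size reported by
## Windows can stay up regardless.
invisible(gc())
```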

Did any of you experience similar behavior? Or even better, does 
anyone know why this is happening and how it might be fixed (or worked 
around)? ;-)

I really need your help on this one as it's crucial for my thesis. 
Thanks a lot to anyone replying!

Regards,
Janko

##### EXAMPLE #####

# Register the classes (the returned generators are not needed below,
# since new() works by class name)
setRefClass("A", fields=list(.PRIMARY="environment"))
setRefClass("Test", fields=list(a="A"))

obj.1 <- lapply(1:5000, function(x){
     rnorm(x)
})
names(obj.1) <- paste("sample", 1:5000, sep=".")
obj.1 <- as.environment(obj.1)

test <- new("Test", a=new("A", .PRIMARY=obj.1))
test$a$.PRIMARY$sample.10

#+++++

object.size(test)
object.size(test$a)
object.size(obj.1)
# RAM used by R session: 118 MB

save(obj.1, file="C:/obj.1.Rdata")
# Results in an object of ca. 94 MB
save(test, file="C:/test.Rdata")
# Results in an object of ca. 94 MB

##### START A NEW R SESSION #####

load("C:/test.Rdata")
# RAM consumption still fine at 115 - 118 MB

# But watch how it goes up as we repeatedly load objects
for(x in 1:5){
     load("C:/test.Rdata")
}
for(x in 1:5){
     load("C:/obj.1.Rdata")
}
# Somehow there seems to be an upper limit, though

# Removing the objects does not bring down RAM consumption
rm(obj.1)
rm(test)

##########
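As a possible workaround (no idea yet whether it changes the RAM 
picture) I am also looking at 'saveRDS()'/'readRDS()' (public since R 
2.13.0), which serialize a single object and return it on read, so the 
result can be assigned straight to the Reference Class field without 
'load()'s restore-into-the-environment step. A small self-contained 
sketch:

```r
## Serialize one object on its own; readRDS() returns it directly.
obj.small <- as.environment(list(sample.1 = rnorm(10)))
f <- tempfile(fileext = ".rds")
saveRDS(obj.small, file = f)
restored <- readRDS(f)   # could go straight into test$a$.PRIMARY
```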

> Sys.info()
                      sysname                      release
                    "Windows"                         "XP"
                      version                     nodename
"build 2600, Service Pack 3"               "ASHB-109C-02"
                      machine                        login
                        "x86"                     "wwa418"
                         user
                     "wwa418"

> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] codetools_0.2-8 tools_2.13.1
