[Rd] unexpectedly high memory use in R 2.14.0

Henrik Bengtsson hb at biostat.ucsf.edu
Thu Apr 12 03:02:13 CEST 2012


Leaving aside what's going on inside abind::abind(), maybe the
following sheds some light on what is being wasted:

# Preallocate (probably doesn't make a difference because it's a list)
mat.data <- vector("list", length=length(files));
for (j in seq_along(files)) {
     # load() returns the names of the objects it created
     vars <- load(file.path(dump.dir, files[j]));
     mat.data[[j]] <- data;
     # Not needed anymore/remove everything loaded
     rm(list=vars);
}

data <- abind(mat.data, along=2);
# Not needed anymore
rm(mat.data);

save(data, file=file.path(dump.dir, filename))
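
To see where the memory actually goes, one option is to compare the size of
the loaded chunks with the peak that gc() reports around the binding step.
The following is only a rough sketch, not a diagnosis: dump.dir and files
are hypothetical stand-ins filled with tiny random arrays, each chunk is
loaded into its own environment (my own variation, not what the code above
does), and abind is only called if it happens to be installed.

```r
# Hypothetical stand-ins for dump.dir and files: three tiny chunks on disk
dump.dir <- tempfile("dump")
dir.create(dump.dir)
files <- sprintf("chunk%d.RData", 1:3)
for (f in files) {
  data <- array(rnorm(2 * 3 * 4), dim = c(2, 3, 4))
  save(data, file = file.path(dump.dir, f))
}
rm(data)

# Load each chunk into its own environment so nothing lingers in the workspace
mat.data <- vector("list", length = length(files))
for (j in seq_along(files)) {
  env <- new.env()
  load(file.path(dump.dir, files[j]), envir = env)
  mat.data[[j]] <- env$data
  rm(env)
}

# Bytes held by the list of chunks
sizes <- vapply(mat.data, function(x) as.numeric(object.size(x)), numeric(1))
print(sum(sizes))

# Reset the "max used" counters, run the binding step, then read the peak
invisible(gc(reset = TRUE))
if (requireNamespace("abind", quietly = TRUE)) {
  combined <- abind::abind(mat.data, along = 2)
  print(object.size(combined), units = "Kb")
}
peak <- gc()["Vcells", "max used"]  # peak number of vector cells (8 bytes each)
print(peak)
```

If the peak is several times the size of the combined array, the extra
copies are being made inside the binding step rather than in the loop.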

My $.02
/Henrik

On Wed, Apr 11, 2012 at 3:53 PM, andre zege <andre.zege at gmail.com> wrote:
> I recently started using R 2.14.0 on a new machine and I am experiencing
> what seems like unusually greedy memory use. It happens all the time, but
> to give a specific example, let's say I run the following code
>
> --------
>
> for(j in 1:length(files)){
>      load(file.path(dump.dir, files[j]))
>      mat.data[[j]]<-data
> }
> save(abind(mat.data, along=2), file.path(dump.dir, filename))
>
> ---------
>
> It loads parts of a multidimensional matrix into a list, then binds them
> along the second dimension and saves the result to disk. The code works,
> although slowly, but what's strange is the amount of memory it uses.
> In particular, each chunk of data is between 50M and 100M, and altogether
> the bound matrix is 1.3G. One would expect R to use roughly double
> that memory - to keep mat.data and its bound version separately, or 1G. I
> could imagine that somehow it could use 3 times the size of the matrix. But
> in fact it uses more than 5.5 times (almost all of my physical memory) and
> I think it is swapping a lot to disk. For this particular task, my top
> output shows R eating more than 7G of memory and using up 11G of virtual
> memory as well
>
> $top
>
>  PID  USER  PR  NI  VIRT  RES   SHR   S  %CPU  %MEM  TIME+     COMMAND
> 8823  user  25   0  11g   7.2g  10m   R  99.7  92.9  5:55.05   R
> 8590  root  15   0  154m  16m   5948  S   0.5   0.2  23:22.40  Xorg
>
>
> I have a strong suspicion that something is off with my R binary; I don't
> think I have experienced anything like this in a long time. Is this in line
> with what I am supposed to experience? Are there any ideas for diagnosing
> what is going on?
> I would appreciate any suggestions.
>
> Thanks
> Andre
>
>
> ==================================
>
> Here is what I am running on:
>
>
> CentOS release 5.5 (Final)
>
>
>> sessionInfo()
> R version 2.14.0 (2011-10-31)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices datasets  utils     methods   base
>
> other attached packages:
> [1] abind_1.4-0       rJava_0.9-3       R.utils_1.12.1    R.oo_1.9.3        R.methodsS3_1.2.2
>
> loaded via a namespace (and not attached):
> [1] codetools_0.2-8 tcltk_2.14.0    tools_2.14.0
>
>
>
> I configured the R build as follows
> /configure --prefix=/usr/local/R --enable-byte-compiled-packages=no
> --with-tcltk --enable-R-shlib=yes
>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


