[Rd] Q: R 2.2.1: Memory Management Issues?

Simon Urbanek simon.urbanek at r-project.org
Fri Jan 6 01:12:33 CET 2006


Karen,

On Jan 5, 2006, at 5:18 PM, <Karen.Green at sanofi-aventis.com> wrote:

> I am trying to run a R script which makes use of the MCLUST package.
> The script can successfully read in the approximately 17000 data  
> points ok, but then throws an error:
> --------------------------------------------------------
> Error:  cannot allocate vector of size 1115070Kb

That is 1.1GB of RAM for a single vector(!). As you stated yourself,  
the total upper limit is 2GB, so you cannot even fit two of those in  
memory - there is not much you can do with it even if it does get  
allocated.
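
A rough back-of-the-envelope check of that figure, as a sketch only. The
pairwise-storage reading at the end is an assumption on my part (one double
per pair of observations, i.e. n(n-1)/2 of them), not something the error
message states:

```r
# Size reported in the error message, in Kb (1 Kb = 1024 bytes).
kb <- 1115070
bytes <- kb * 1024
gb <- bytes / 1024^3      # about 1.06, i.e. "1.1GB" in round numbers

# A double takes 8 bytes, so this is how many doubles the vector would hold.
doubles <- bytes / 8

# Assumption: the vector stores one value per pair of observations,
# n * (n - 1) / 2 in total. Solve that quadratic for n.
n <- (1 + sqrt(1 + 8 * doubles)) / 2

cat(sprintf("%.2f GB, %.0f doubles, n = %.0f\n", gb, doubles, n))
```

If that reading is right, the request corresponds to n = 16896
observations, which fits "approximately 17000 data points", and the
size grows with the square of n.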

> summary(EMclust(y),y)

I suspect that memory is the least of your problems. Did you even try  
to run EMclust on a small subsample? I suspect that if you did, you  
would figure out that what you are trying to do is not likely to  
terminate within days...
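
One way to get a feel for this is to time a few increasing subsample
sizes before committing to the full run. The sketch below is only an
illustration: y is simulated here (assumed to be a 17000 x 2 numeric
matrix), and dist() is a base-R stand-in for any step that touches every
pair of observations. It is not the EMclust call itself, so swap in the
call from your own script.

```r
set.seed(1)
# Simulated stand-in for the real data (assumption: 17000 x 2 numeric matrix).
y <- matrix(rnorm(17000 * 2), ncol = 2)

sizes <- c(500, 1000, 2000)
pairs <- numeric(length(sizes))
elapsed <- numeric(length(sizes))

for (i in seq_along(sizes)) {
  # Draw a random subsample of rows.
  idx <- sample(nrow(y), sizes[i])
  sub <- y[idx, , drop = FALSE]
  # Stand-in workload: all pairwise distances. Replace with your own call.
  elapsed[i] <- system.time(d <- dist(sub))[["elapsed"]]
  pairs[i] <- length(d)   # n * (n - 1) / 2 values
}

print(data.frame(n = sizes, pairs = pairs, elapsed = elapsed))
```

Doubling n roughly quadruples the number of pairs, so extrapolating
from the small runs to the full data gives an estimate of whether the
complete job is feasible at all.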

> (1) I had initially thought that Windows 2000 should be able to  
> allocate up to about 2 GB memory.  So, why is there a problem to  
> allocate a little over 1GB on a defragmented disk with over 15 GB  
> free?  (Is this a pagefile size issue?)

Because that is not the only 1GB vector that is allocated. Your  
"15GB/defragmented" figures are irrelevant - if anything, look at how  
much virtual memory is set up in your system's preferences.

> (2) Do you think the origin of the problem is
>     (a) the R environment, or
>     (b) the function in the MCLUST package using an in-memory  
> instead of an on-disk approach?

Well, a toy example of 17000x2 needs 2.3GB and is unlikely to  
terminate anytime soon, so I'd rather call it shooting with the wrong  
gun. Maybe you should consider a different approach to your problem -  
possibly ask on the BioConductor list, because people there have more  
experience with large data, and this is not really a technical  
question about R but rather one about how to apply statistical methods.

> (3)
>     (a) If the problem originates in the R environment, would  
> switching to the Linux version of R solve the problem?

Any reasonable unix will do - technically (64-bit versions  
preferably, but in your case even 32-bit would do). Again, I don't  
think memory is your only problem here, though.

Cheers,
Simon
