[R] memory, i am getting mad in reading climate data

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Mar 18 08:29:34 CET 2012


On 17/03/2012 20:42, jim holtman wrote:
> Another suggestion is to start with a subset of the data file to see
> how much memory is required for your processing.  One of the
> misconceptions is that "memory is free".  People think that with
> virtual memory and other such tools, that there is no restriction on
> what you can do.  Instead of starting off, and getting "mad", when
> trying to read in all your data, try a small subset and then look at
> the memory usage of the R process.  I would assume that you are
> running on a 32-bit Windows system, which if you are lucky, can
> address 3GB for a user process.  My rule of thumb is that the largest
> object I can work with is 30% of the real memory I have available, so
> for my Windows system which lets me address almost 3GB, the biggest
> object I would attempt to work with is 1GB,

But that's because your real limit is address space, not the amount of 
memory you have.  Only people with 32-bit OSes have that problem, and 
they are rapidly going obsolete.  Some of us have been using 64-bit OSes 
for more than a decade (even with 1GB of RAM).
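(A quick way to see which situation you are in, for anyone following
along: a 64-bit build of R has 8-byte pointers.)

  .Machine$sizeof.pointer   # 8 on a 64-bit build of R, 4 on a 32-bit build
  R.version$arch            # e.g. "x86_64" rather than "i386"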

> Working with a subset, you would understand how much memory an XXXMB
> file might require.  This would then give you an idea of what the
> maximum size file you might be able to process.
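A concrete way to follow that advice (the object below is just a
stand-in, not the poster's data): create or read a small piece, look at
its size, and scale up by the full dimensions.

  x <- numeric(1e6)                     # a million doubles as a stand-in subset
  print(object.size(x), units = "Mb")   # about 7.6 Mb: 1e6 values at 8 bytes each
  gc()                                  # memory totals for the whole R process
  # multiply the per-subset size by (full size / subset size) for a rough estimate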
>
> Every system has limits.  If you have lots of money, then invest in a
> 64-bit system with 100GB of real memory and you probably won't hit its

Actually, you only need a pretty modest amount of money for that, and an 
extremely modest amount for a 64-bit system with 8GB RAM (which will 
give you maybe 5x more usable memory than Jim's system).

Just about any desktop computer made in the last 5 years can be upgraded 
to that sort of spec (and our IT staff did so for lots of our machines 
in mid-2010).

> limits for a while.  Otherwise, look at taking incremental steps and
> possibly determining if you can partition the data.  You might
> consider a relational database to store the data so that it is easier
> to select a subset of data to process.
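For what that route looks like in practice, here is a minimal sketch
with RSQLite; the file, table and column names are invented for
illustration, not taken from the poster's data.

  library(DBI)
  library(RSQLite)
  con <- dbConnect(SQLite(), dbname = "climate.sqlite")   # hypothetical database file
  sub <- dbGetQuery(con,
      "SELECT lon, lat, temperature FROM readings WHERE year = 2010")  # hypothetical schema
  dbDisconnect(con)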

But in this case, netCDF is itself a sort of database system with lots 
of facilities to select subsets of the data.
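For example, get.var.ncdf() takes start and count arguments, so you can
read one slab at a time instead of the whole variable; the dimension
order and sizes below are only a guess at the poster's file.

  library(ncdf)
  ex.nc <- open.ncdf("climate.nc")        # file name assumed
  ## read a 50 x 50 spatial block for the first 12 time steps only
  slab <- get.var.ncdf(ex.nc, 'Temperature',
                       start = c(1, 1, 1), count = c(50, 50, 12))
  close.ncdf(ex.nc)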

>
>
>
> 2012/3/17 Uwe Ligges<ligges at statistik.tu-dortmund.de>:
>>
>>
>> On 17.03.2012 19:27, David Winsemius wrote:
>>>
>>>
>>> On Mar 17, 2012, at 10:33 AM, Amen wrote:
>>>
>>>> I faced this problem when typing:
>>>>
>>>> temperature <- get.var.ncdf(ex.nc, 'Temperature')
>>>>
>>>> *unable to allocate a vector of size 2.8 GB*
>>>
>>>
>>> Read the R-Win-FAQ
>>
>>>
>>>>
>>>>
>>>> By the way, my computer memory is 4GB and the original size of the
>>>> netCDF file is 1.4GB.
>>
>>
>> ... and reading / storing the data in memory may require much more than
>> 4GB...
>>
>> Uwe Ligges
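The 2.8GB in the error message is consistent with that: R stores
numeric data as 8-byte doubles, so if the 1.4GB file holds 4-byte
floats (a guess, the poster does not say), the values double in size on
the way in, before any temporary copies made during processing.

  n <- 1.4e9 / 4      # number of values, assuming 4-byte floats on disk
  n * 8 / 1e9         # about 2.8 GB for one in-memory copy as doubles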
>>
>>
>>>> I don't know what the problem is. Any suggestion, please?
>>>> I also tried
>>>> memory.limit(4000)
>>>> 4000
>>>> but it didn't solve the problem. Any help?
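On the memory.limit(4000) attempt: memory.limit() (Windows only) just
moves the cap R imposes on itself, in Mb; it cannot conjure address
space a 32-bit process does not have, which is why it made no
difference here.

  memory.limit()              # current cap in Mb
  memory.limit(size = 4000)   # raise the cap to 4000 Mb if the OS allows it
  memory.size()               # Mb currently in use by this R session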



-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


