[R] Error: cannot allocate vector of size...

jim holtman jholtman at gmail.com
Tue Nov 10 15:46:13 CET 2009


Check out:

http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg79590.html

for sampling a large file.
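
In case that link goes stale, here is a minimal sketch of one way to do it, not necessarily what the linked post does. It reads the file through a connection in chunks, keeps each data line with probability 1/3, and only parses the lines it kept. The function name sample_rows, its arguments, and the toy file are made up for illustration; the real file would be passed in place of tmp.

```r
# Sketch only: sample_rows() and its arguments are hypothetical names,
# not something from this thread or from base R.
sample_rows <- function(path, frac = 1/3, chunk_lines = 100000L, header = TRUE) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # Read the header line once so it is never subject to sampling
  header_line <- if (header) readLines(con, n = 1L) else character(0)
  kept <- list()
  repeat {
    lines <- readLines(con, n = chunk_lines)
    if (length(lines) == 0L) break          # end of file
    # Keep each line independently with probability `frac`
    kept[[length(kept) + 1L]] <- lines[runif(length(lines)) < frac]
  }
  # Parse only the retained lines
  read.table(text = c(header_line, unlist(kept)), header = header)
}

# Toy demonstration on a small temporary file (made-up data)
set.seed(42)
tmp <- tempfile(fileext = ".dat")
full <- data.frame(id = 1:3000, x = rnorm(3000), y = runif(3000))
write.table(full, tmp, row.names = FALSE, quote = FALSE)

DD <- sample_rows(tmp, frac = 1/3, chunk_lines = 500L)
nrow(DD)   # roughly a third of 3000; the exact count varies with the seed
```

Because each row is kept independently, you get approximately a third of the cases rather than exactly a third. The kept lines are still held as text before parsing, so on a 32-bit machine you may need a smaller frac, or to write the kept lines to a temporary file and call read.table on that instead.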

On Tue, Nov 10, 2009 at 8:32 AM, maiya <maja.zaloznik at gmail.com> wrote:
>
> OK, it's the simple math that's confusing me :)
>
> So you're saying 2.4GB, while Windows sees the data as 700MB. Why is that
> different?
>
> And let's say I could potentially live with e.g. 1/3 of the cases - that
> would make it 0.8GB, which should be fine? But then my question is if there
> is any way to sample the rows in read.table? Or what would be the best way
> of importing a random third of my cases?
>
> Thanks!
>
> M.
>
>
>
> jholtman wrote:
>>
>> A little simple math.  You have 3M rows with 100 items on each row.
>> If read in this would be 300M items.  If numeric, 8 bytes/item, this
>> is 2.4GB.  Given that you are probably using a 32 bit version of R,
>> you are probably out of luck.  A rule of thumb is that your largest
>> object should consume at most 25% of your memory since you will
>> probably be making copies as part of your processing.
>>
>> Given that, if you want to read in 100 variables at a time, I would
>> say your limit would be about 500K rows to be reasonable.  So you have
>> a choice: read in fewer rows, read in all 3M rows but at 20 columns
>> per read, or put the data in a database and extract what you need.
>> Unless you go to a 64-bit version of R you will probably not be able
>> to have the whole file in memory at one time.
>>
>> On Tue, Nov 10, 2009 at 7:10 AM, maiya <maja.zaloznik at gmail.com> wrote:
>>>
>>> I'm trying to import a table into R; the file is about 700MB. Here's my
>>> first try:
>>>
>>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>>>
>>> Error: cannot allocate vector of size 15.6 Mb
>>> In addition: Warning messages:
>>> 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>>>  Reached total allocation of 1535Mb: see help(memory.size)
>>> 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>>>  Reached total allocation of 1535Mb: see help(memory.size)
>>> 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>>>  Reached total allocation of 1535Mb: see help(memory.size)
>>> 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>>>  Reached total allocation of 1535Mb: see help(memory.size)
>>>
>>> Then I tried
>>>
>>>> memory.limit(size=4095)
>>>  and got
>>>
>>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>>> Error: cannot allocate vector of size 11.3 Mb
>>>
>>> but no additional errors. Then optimistically to clear up the workspace:
>>>
>>>> rm()
>>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>>> Error: cannot allocate vector of size 15.6 Mb
>>>
>>> Can anyone help? I'm even confused by the values: 15.6Mb, 1535Mb, 11.3Mb?
>>> I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable
>>> memory is usually 2Gb. Surely they mean GB?
>>>
>>> The file I'm importing has about 3 million cases with 100 variables that
>>> I want to crosstabulate each with each. Is this completely unrealistic?
>>>
>>> Thanks!
>>>
>>> Maja
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem that you are trying to solve?
>>
>>
>>
>
> --
> View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26283467.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



