[R] readBin fails to read large files

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Sep 1 18:41:49 CEST 2011


On Thu, 1 Sep 2011, Prof Brian Ripley wrote:

> readBin is intended to read a few items at a time, not 10^9.  You are 
> probably getting 32-bit integer overflow inside your OS, since the number of 
> bytes you are trying to read in one go exceeds 2GB.
>
> Don't do that: read say a million at a time.
>
> And BTW, if these really are unsigned ints you will get wraparound.
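
To make the "a million at a time" advice concrete, here is a minimal sketch of a chunked read. The helper name read_int_chunks and its arguments are made up for illustration and are not part of base R; it assumes the file holds 4-byte little-endian integers and that the caller knows roughly how many to expect.

```r
# Bytes requested in a single readBin() call versus the signed 32-bit limit:
#   1e8 * 4 = 4.0e8  < 2^31 (2147483648)   -> works
#   6e8 * 4 = 2.4e9  > 2^31                -> fails
#   1e9 * 4 = 4.0e9  > 2^31                -> fails

# Hypothetical helper (not part of base R): read up to `n_total` 4-byte
# integers from `path`, asking readBin() for at most `chunk` items per call.
read_int_chunks <- function(path, n_total, chunk = 1e6, endian = "little") {
  con <- file(path, open = "rb")
  on.exit(close(con))
  out <- integer(n_total)            # preallocate the result
  done <- 0
  while (done < n_total) {
    want <- min(chunk, n_total - done)
    x <- readBin(con, what = integer(), n = want, size = 4, endian = endian)
    if (length(x) == 0L) break       # end of file reached early
    out[done + seq_along(x)] <- x
    done <- done + length(x)
  }
  out[seq_len(done)]                 # drop any unused tail
}
```

Something like read_int_chunks("file", 1e9) would then issue a thousand small reads instead of one 4 GB request. The result still has to fit in memory as a single integer vector.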

It seems someone did not read the help:

   signed: logical.  Only used for integers of sizes 1 and 2, when it
           determines if the quantity on file should be regarded as a
           signed or unsigned integer.

and since you are using size = 4 implicitly, this will be ignored.
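
One possible workaround, sketched under assumptions rather than offered as the recommended method: since signed = FALSE is honoured for size = 2, each unsigned 32-bit value can be read as two unsigned 16-bit halves and recombined as a double. The helper name read_uint32_le is hypothetical, and the half-word order below assumes a little-endian file.

```r
# Hypothetical helper (not part of base R): decode `n` little-endian
# unsigned 32-bit integers from connection `con`, returning doubles.
read_uint32_le <- function(con, n) {
  halves <- readBin(con, what = integer(), n = 2 * n, size = 2,
                    signed = FALSE, endian = "little")
  k <- length(halves) %/% 2            # complete 32-bit values actually read
  lo <- halves[seq(1, by = 2, length.out = k)]   # low 16 bits come first
  hi <- halves[seq(2, by = 2, length.out = k)]   # high 16 bits follow
  lo + 65536 * hi                      # double; exact for values up to 2^32 - 1
}
```

The result is a double vector, because values of 2^31 and above do not fit in R's integer type. For a large file this would be called repeatedly on an open connection with a modest n each time.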

>
> On Thu, 1 Sep 2011, Benton, Paul wrote:
>
>> Posting for a friend
>> 
>> Begin forwarded message:
>> 
>> From: "Geier, Florian" <florian.geier08 at imperial.ac.uk>
>> Subject: Fwd: readBin fails to read large files
>> Date: September 1, 2011 4:10:53 PM GMT+01:00
>> To:
>> 
>> 
>> 
>> Begin forwarded message:
>> 
>> Date: 1 September 2011 16:01:45 GMT+01:00
>> Subject: readBin fails to read large files
>> 
>> Dear all,
>> 
>> I am trying to read a large file (~2GB) of unsigned ints into R. Using the 
>> command:
>> 
>> raw<-readBin("file",n=10^8, integer(),endian="little",signed=FALSE)
>> 
>> It works fine for n=10^8, but fails for n=10^9 (or even at n=6*10^8). My 
>> .Machine$sizeof.long is 8 (bytes).
>> I am running R 2.13.1 on a x86_64-apple-darwin9.8.0/x86_64 (64-bit) 
>> architecture.
>> 
>> Thanks for your help
>> 
>> Florian
>> 
>> --
>> AXA doctoral fellow
>> Bundy lab - Biomolecular Medicine
>> Imperial College London
>> 
>>
>
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>

