[R] reading a big file

Charles C. Berry cberry at tajo.ucsd.edu
Thu May 24 20:30:53 CEST 2007


On Thu, 24 May 2007, Christoph Scherber wrote:

> Dear Remigijus,
>
> You should change memory allocation in Windows XP, as described in
>
> http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021


Probably this will not solve the problem: the object to be created will 
need about 400 MB, and scan() will require additional memory while it 
creates that object. Not to mention that the OS itself will consume a 
chunk of the 512 MB of RAM.
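
As a rough check of that figure, here is a back-of-the-envelope sketch. It 
assumes R's usual storage sizes of 4 bytes per integer and 8 bytes per 
double; the variable names are only illustrative. Note that scan() reads 
doubles by default (what = double()), so the 400 MB figure applies only 
if the values are stored as integers, e.g. via what = integer().

```r
# Approximate size of a vector holding 100 million values.
n_values <- 1e8

# Storage per element (assumed): 4 bytes for integer, 8 bytes for double.
mib_as_integer <- n_values * 4 / 2^20   # roughly 381 MiB
mib_as_double  <- n_values * 8 / 2^20   # roughly 763 MiB

cat("as integer:", round(mib_as_integer, 1), "MiB\n")
cat("as double: ", round(mib_as_double, 1), "MiB\n")
```

Either way, the result leaves little or no headroom on a 512 MB machine.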

>
> Hope this helps.
>
> Best wishes
> Christoph
>
>
> --
> Christoph Scherber
> DNPW, Agroecology
> University of Goettingen
> Waldweg 26
> D-37073 Goettingen
>
> +49-(0)551-39-8807
>
>
>
>
> Remigijus Lapinskas schrieb:
>> Dear All,
>>
>> I am on WindowsXP with 512 MB of RAM, R 2.4.0, and I want to read in a
>> big file mln100.txt. The file is 390MB big, it contains a column of 100
>> millions integers.
>>
>>> mln100=scan("mln100.txt")
>> Error: cannot allocate vector of size 512000 Kb
>> In addition: Warning messages:
>> 1: Reached total allocation of 511Mb: see help(memory.size)
>> 2: Reached total allocation of 511Mb: see help(memory.size)
>>
>> In fact, I would be quite happy if I could read, say, every tenth
>> integer (line) of the file. Is it possible to do this?
>>

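To keep the first, eleventh, twenty-first, etc. lines (shown here for the 
first 10 sampled values of a test file "tmp.txt"; substitute your own 
file name and lengthen res and the loop as needed):

mln.con <- file("tmp.txt", open = "r")
res <- integer(10)
# each readLines() call consumes 10 lines; keep only the first of them
for (i in 1:10) res[i] <- as.integer(readLines(mln.con, n = 10)[1])
close(mln.con)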
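
The loop above collects only the first ten sampled values. For the whole 
file, calling readLines() ten million times would be slow, so one option 
is to read larger chunks and subsample each chunk. The following is a 
minimal sketch, not a tuned solution: the function name read_every_nth 
and its parameters are made up for illustration, and it assumes one 
integer per line with no blank lines.

```r
# Read every n-th value from a file with one integer per line,
# processing the file in chunks so that it never sits in memory whole.
# read_every_nth, n and chunk_lines are hypothetical names.
read_every_nth <- function(path, n = 10L, chunk_lines = 100000L) {
  # chunk_lines must be a multiple of n so that sampling stays aligned
  # across chunk boundaries (lines 1, n + 1, 2n + 1, ...)
  stopifnot(chunk_lines %% n == 0)
  con <- file(path, open = "r")
  on.exit(close(con))
  pieces <- list()
  k <- 0L
  repeat {
    # scan() on an open connection continues from the current position
    x <- scan(con, what = integer(), nlines = chunk_lines, quiet = TRUE)
    if (length(x) == 0L) break
    k <- k + 1L
    pieces[[k]] <- x[seq(1L, length(x), by = n)]
    if (length(x) < chunk_lines) break   # short chunk: end of file
  }
  unlist(pieces)
}

# Small self-check on a temporary file holding the integers 1..95
tmp <- tempfile(fileext = ".txt")
writeLines(as.character(1:95), tmp)
sampled <- read_every_nth(tmp, n = 10L, chunk_lines = 20L)
print(sampled)
```

For the real file one would call something like 
read_every_nth("mln100.txt"); the result would hold 10 million integers, 
about 40 MB. The chunk size trades speed against peak memory use.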


>> Cheers,
>> Rem
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>> .
>>
>
>

Charles C. Berry                        (858) 534-2098
                                          Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	         UC San Diego
http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0901


