[R] R tools for large files

Murray Jorgensen maj at stats.waikato.ac.nz
Mon Aug 25 07:16:30 CEST 2003


This is no doubt true, but some things in R work very well with big 
files without the need for any extra software:

readLines("c:/data/perry/data.csv", n = 12)
# prints out the first 12 lines as strings

flows <- read.csv("c:/data/perry/data.csv", na.strings = "?", nrows = 1000)
# makes a data frame from the first 1000 records
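
The same approach extends to a block of records further into the file. What follows is only a minimal sketch, assuming the file has no header line; read_block is a hypothetical helper name, while skip and nrows are standard read.csv arguments.

```r
# Hypothetical helper: read n records starting at physical line `start`.
# Assumes a headerless CSV, so line k of the file is record k.
read_block <- function(path, start, n, ...) {
  # skip = start - 1 discards the lines before the block;
  # nrows = n stops reading once the block is complete.
  read.csv(path, header = FALSE, skip = start - 1L, nrows = n, ...)
}
```

For example, read_block("c:/data/perry/data.csv", 1001, 1000, na.strings = "?") would return records 1001 to 2000. R still has to scan past the skipped lines, so the cost grows with the starting position.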

I would like to get some solution where I don't find myself generating 
large numbers of derived files from the original data file.
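
One possible way to do this in base R alone is to stream the file through a connection in chunks. The sketch below is illustrative rather than a tested solution for this data: count_lines and read_selected_lines are hypothetical names, the chunk size is arbitrary, and the file is assumed to be a headerless CSV with one record per line.

```r
# Hypothetical helpers; nothing here comes from an existing package.

# Count the lines in a file without holding the whole file in memory.
count_lines <- function(path, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  n <- 0L
  repeat {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0L) break   # end of file reached
    n <- n + length(chunk)
  }
  n
}

# Build a data frame from the records whose line numbers are in `lines`.
# Rows come back in file order, whatever the order of `lines`.
# Assumes no header line; extra arguments are passed on to read.csv.
read_selected_lines <- function(path, lines, chunk_size = 10000L, ...) {
  lines <- sort(unique(as.integer(lines)))
  con <- file(path, open = "r")
  on.exit(close(con))
  kept <- character(0)
  offset <- 0L                       # number of lines already consumed
  repeat {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0L) break
    # Wanted line numbers that fall inside this chunk, as chunk positions.
    idx <- lines[lines > offset & lines <= offset + length(chunk)] - offset
    kept <- c(kept, chunk[idx])
    offset <- offset + length(chunk)
  }
  # Parse only the retained lines; no derived file is written.
  read.csv(text = kept, header = FALSE, ...)
}
```

A random subfile of 1000 records could then be drawn with something like
read_selected_lines(path, sample(count_lines(path), 1000), na.strings = "?").
The file is read twice (once to count, once to select), but only the selected lines are ever parsed into a data frame.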


Andrew C. Ward wrote:
> Dear Murray,
> One way that works very well for many people (including me)
> is to store the data in an external database, such as MySQL,
> and read in just the bits you want using the excellent
> package RODBC. Getting a database to do all the selecting
> is very fast and efficient, leaving R to concentrate on the
> analysis and visualisation. This is all described in the
> R Import/Export Manual.
> Regards,
> Andrew C. Ward
> CAPE Centre
> Department of Chemical Engineering
> The University of Queensland
> Brisbane Qld 4072 Australia
> andreww at cheque.uq.edu.au
> Quoting Murray Jorgensen <maj at stats.waikato.ac.nz>:
>>I'm wondering if anyone has written some functions or code for handling
>>very large files in R. I am working with a data file that is 41
>>variables times who knows how many observations making up 27MB altogether.
>>The sort of thing that I am thinking of having R do is
>>- count the number of lines in a file
>>- form a data frame by selecting all cases whose line numbers are in a
>>supplied vector (which could be used to extract random subfiles of
>>particular sizes)
>>Does anyone know of a package that might be useful for this?
>>Dr Murray Jorgensen     
>>Department of Statistics, University of Waikato,
>>Hamilton, New Zealand
>>Email: maj at waikato.ac.nz                               
>>Fax 7 838 4155
>>Phone  +64 7 838 4773 wk    +64 7 849 6486 home    Mobile
>>021 1395 862
>>R-help at stat.math.ethz.ch mailing list

Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: maj at waikato.ac.nz                                Fax 7 838 4155
Phone  +64 7 838 4773 wk    +64 7 849 6486 home    Mobile 021 1395 862
