[R] Reading in 9.6GB .DAT File - OK with 64-bit R?

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Fri Mar 9 00:44:44 CET 2012


On Thu, Mar 8, 2012 at 6:19 PM, RHelpPlease <rrumple at trghcsolutions.com> wrote:
> Hi there,
> I wish to read a 9.6GB .DAT file into R (64-bit R on 64-bit Windows machine)
> - to then delete a substantial number of rows & then convert to a .csv file.
> Upon the first attempt the computer crashed (at some point last night).

 If you are trying to delete a substantial number of rows as a one-off
operation to get a smaller dataset, then you might be better off
filtering it with a tool like perl, awk, or sed - something that reads
a line at a time, processes it, and then perhaps writes a line of
output.

 For example, suppose you only want lines where the 25th character in
each line is an 'X'. Then all you do is:

 awk 'substr($0,25,1)=="X"' < bigfile.dat >justX.dat

Here I've used awk to filter input based on a condition. It never
reads in the whole file, so memory usage isn't a problem.

 Awk for Windows is available, possibly as a native version or as part
of Cygwin.

 You could do a similar thing in R by opening a text connection to
your file and reading one line at a time, writing the modified or
selected lines to a new file.
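
 A minimal sketch of that R approach, assuming the same condition as
the awk example above (keep lines whose 25th character is "X"); the
file names here are just placeholders:

  infile  <- file("bigfile.dat", open = "r")
  outfile <- file("justX.dat", open = "w")
  while (length(line <- readLines(infile, n = 1)) > 0) {
      # keep only lines whose 25th character is "X"
      if (substr(line, 25, 25) == "X") {
          writeLines(line, outfile)
      }
  }
  close(infile)
  close(outfile)

 Reading one line at a time is slow in R, so in practice you would
probably read a block of lines at a time (say n = 10000), filter the
whole block with a vectorised substr(), write the surviving lines, and
loop until readLines() returns an empty vector.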

Barry


