[R] Reading data with 'awk' - basics?

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Oct 17 15:40:30 CEST 2011


On Mon, 17 Oct 2011, Brian Smith wrote:

> Hi,
>
> I have a large file from which I need a subset of rows. Instead of reading
> it all into memory, I use the awk command to get the relevant rows. However,
> I'm doing it pretty inefficiently as I write the subset to disk, before
> reading it into R. Is there a way that I can read it into an R object
> without writing to disk? For example, this is what I do currently:
>
> ## write test sample file
> mat1 <- matrix(sample(1:100,16),8,2)
> fname1 <- 'temp1.txt'
> fname2 <- 'temp2.txt'
> write.table(mat1,fname1,sep='\t',row.names=FALSE,col.names=FALSE)
>
> ## Read a subset of rows, write to file, and read from file
> system(paste("awk '(NR > 1 && NR < 4) {print $0}' ",fname1,
>              " > ",fname2,sep=''))
> mat2 <- read.table(fname2,sep='\t')
>
> print(mat2)
> #####
>
> Is there a way that I can skip writing to disk?

Use a pipe() connection.
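
A minimal sketch of that, reusing the example from the question. It assumes
awk is on the PATH; the tempfile() name and the shQuote() call are additions
for illustration and are not in the original post. read.table() accepts a
connection, so the awk output never touches the disk:

```r
## write the test sample file, as in the question
## (tempfile() is an assumption here; the post used 'temp1.txt')
mat1 <- matrix(sample(1:100, 16), 8, 2)
fname1 <- tempfile(fileext = ".txt")
write.table(mat1, fname1, sep = "\t", row.names = FALSE, col.names = FALSE)

## build the awk command; shQuote() protects file names containing spaces
cmd <- paste("awk '(NR > 1 && NR < 4) {print $0}'", shQuote(fname1))

## read.table() opens the unopened pipe connection, reads awk's output,
## and closes the connection when it is done
mat2 <- read.table(pipe(cmd), sep = "\t")
print(mat2)
```

Passing the unopened connection straight to read.table() means there is
nothing to close() by hand; if the pipe is opened explicitly first, it has to
be closed explicitly as well.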

>
> thanks!

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


