[R] how to skip last lines while reading the data in R

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jan 28 13:59:58 CET 2008


On Mon, 28 Jan 2008, Barry Rowlingson wrote:

> Henrique Dallazuanna wrote:
>> Perhaps:
>>
>> data <- read.table(textConnection(rev(rev(readLines('data.txt'))[-(1:2)])))
>>
>
>  Euurgh! Am I the only one whose sense of aesthetics is enraged by
> this? To get rid of the last two items you reverse the vector, remove
> the first two items, then reverse the vector again?
>
>  One liners are fine for R Golf games, but in the real world, I'd get
> the length of the vector and cut directly off the end. Consider these:
>
> # reverse/trim/reverse:
> rev1 <- function(x,n=100,m=5){
>   for(i in 1:n){
>     y=rev(rev(x)[-(1:m)])
>   }
>   return(y)
> }
>
> # get length, trim
> rev2 <- function(x,n=100,m=5){
>   for(i in 1:n){
>     y=x[1:(length(x)-m)]
>   }
>   return(y)
> }
>
>  > system.time(rev1(1:1000,10000,5))
>  [1] 1.864 0.008 2.044 0.000 0.000
>  > system.time(rev2(1:1000,10000,5))
>  [1] 0.384 0.008 0.421 0.000 0.000
>
>
>  Result: faster, more directly readable code.

And if you know the file size, just use

read.table('data.txt', nrows=<#file_rows>-2)

(and wc -l will tell you the number of rows more efficiently than using a 
text connection: if you must use a temporary home, use file() with no 
arguments, as that is much more efficient).
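
As a rough sketch of the two suggestions above (not a definitive 
implementation): the helper name read_without_tail, its arguments and the 
sample file contents are all hypothetical, made up for illustration. The 
line count is taken with readLines() so the example is self-contained; 
on a Unix-alike, wc -l would supply the same number without reading the 
file into R.

```r
# Hypothetical helper: read a table, ignoring the last `n_skip` lines.
read_without_tail <- function(path, n_skip = 2, header = FALSE, ...) {
  # Count the lines (assumption: the file is small enough to read twice).
  n_lines <- length(readLines(path))
  n_rows <- n_lines - n_skip - as.integer(header)
  if (n_rows < 1) stop("no data rows left after skipping")
  read.table(path, header = header, nrows = n_rows, ...)
}

# Made-up demo file: three data rows followed by two trailer lines.
tmp <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6", "7 8 9",
             "Total rows: 3", "End of file"), tmp)

dat <- read_without_tail(tmp, n_skip = 2)

# Variant with a temporary home: file() with no arguments opens an
# anonymous file connection for both writing and reading.
lines <- readLines(tmp)
con <- file()
writeLines(lines[seq_len(length(lines) - 2)], con)
dat2 <- read.table(con)
close(con)

unlink(tmp)
```

Both dat and dat2 should hold only the three data rows. The nrows 
version avoids building a trimmed copy of the text at all, which is 
why it is the cheaper choice when the row count is known.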

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


