[R] Using readLines on a file without resetting internal file offset parameter?

Thomas Nyberg tomnyberg at gmail.com
Wed Oct 29 17:51:06 CET 2014


Thanks for the response! I'd rather keep the file open than close it, 
since closing would flush the internal buffer. The whole reason I'm doing 
this is to take advantage of the buffering, and closing the file would 
defeat the purpose.

I actually just found a solution, which is to open the file with the "r" 
flag explicitly. That is, the following is the behavior I want:

-----

bash $ echo 1 > testfile
bash $ echo 2 >> testfile
bash $ cat testfile
1
2

bash $ R
R > f <- file('testfile', 'r')
R > readLines(f, n = 1)
[1] "1"
R > readLines(f, n = 1)
[1] "2"
R > readLines(f, n = 1)
character(0)

-----
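
For anyone who wants the full line-by-line loop, here is a minimal sketch 
built on the same idea. The temporary file and the variable names (path, 
lines_seen) are just placeholders for illustration; the only essential 
part is opening the connection with "r" before the first readLines call.

```r
# Create a small test file (placeholder contents mirroring the example above)
path <- tempfile()
writeLines(c("1", "2"), path)

# Open explicitly in read mode so the file position persists between calls
f <- file(path, "r")
lines_seen <- character(0)
repeat {
  line <- readLines(f, n = 1)
  if (length(line) == 0) break   # character(0) signals end of file
  lines_seen <- c(lines_seen, line)
}
close(f)   # close once, after the whole sequence of reads
```

Only one line is held in `line` at a time; lines_seen is there just to 
make the result easy to inspect and would be dropped for a large file.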

If you want to use writeLines in the same fashion, you'll need to open 
the file you are writing to with the "w" flag.
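
A rough sketch of the writing case, assuming a throwaway temporary file 
(path, out and result are made-up names). As far as I can tell from 
?writeLines, an unopened connection is opened in "wt" mode for each call, 
so successive calls would overwrite each other; opening with "w" up front 
avoids that.

```r
path <- tempfile()

# Open once in write mode; successive writeLines calls then append
# to the same open connection instead of reopening the file each time
out <- file(path, "w")
writeLines("first", out)
writeLines("second", out)
close(out)   # closing flushes the buffered output to disk

# Read the file back to check that both lines were written
result <- readLines(path)
```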

It's very odd that file('filename') will let you read from it, but will 
not act the same as file('filename', 'r') when it comes to readLines. Is 
this a bug or is there some reasoning behind this? Regardless, it's 
certainly extremely unintuitive.
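
Going by Bill's note quoted below (I/O on an unopened connection opens 
it, does the operation, then closes it), the difference seems to be 
whether the connection is open at the time of the call. The sketch below 
checks this with isOpen; the file and variable names are placeholders.

```r
path <- tempfile()
writeLines(c("1", "2"), path)

unopened <- file(path)        # connection created but not opened
opened   <- file(path, "r")   # connection opened immediately for reading

state_unopened <- isOpen(unopened)
state_opened   <- isOpen(opened)

# Each call on the unopened connection opens, reads, and closes again,
# so both reads start from the top of the file
first  <- readLines(unopened, n = 1)
second <- readLines(unopened, n = 1)
still_closed <- !isOpen(unopened)

close(unopened)
close(opened)
```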

Thanks again for the response!

Cheers,
Thomas

On 10/29/2014 12:22 PM, William Dunlap wrote:
> Open your file object before calling readLines and close it when you
> are done with a sequence of calls to readLines.
>
>    > tf <- tempfile()
>    > cat(sep="\n", letters[1:10], file=tf)
>    > f <- file(tf)
>    > open(f)
>    > # or f <- file(tf, "r") instead of previous 2 lines
>    > readLines(f, n=1)
>    [1] "a"
>    > readLines(f, n=1)
>    [1] "b"
>    > readLines(f, n=2)
>    [1] "c" "d"
>    > close(f)
>
> I/O operations on an unopened connection generally open it, do the operation,
> then close it.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Wed, Oct 29, 2014 at 8:23 AM, Thomas Nyberg <tomnyberg at gmail.com> wrote:
>> Hi everyone,
>>
>> I would like to read a file line by line, but I would rather not load all
>> lines into memory first. I've tried using readLines with n = 1, but that
>> seems to reset the internal file descriptor's file offset after each call.
>> I.e. this is the current behavior:
>>
>> -------
>>
>> bash $ echo 1 > testfile
>> bash $ echo 2 >> testfile
>> bash $ cat testfile
>> 1
>> 2
>>
>> bash $ R
>> R > f <- file('testfile')
>> R > readLines(f, n = 1)
>> [1] "1"
>> R > readLines(f, n = 1)
>> [1] "1"
>>
>> -------
>>
>> I would like the behavior to be:
>>
>> -------
>>
>> bash $ R
>> R > f <- file('testfile')
>> R > readLines(f, n = 1)
>> [1] "1"
>> R > readLines(f, n = 1)
>> [1] "2"
>>
>> -------
>>
>> I'm coming to R from a python background, where the default behavior is
>> exactly the opposite. I.e. when you read a line from a file it is your
>> responsibility to use seek explicitly to get back to the original position
>> in the file (this is rarely necessary though). Is there some flag to turn
>> off the default behavior of resetting the file offset in R?
>>
>> Cheers,
>> Thomas
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.


