[R] Can scan() detect end-of-file?

William Dunlap wdunlap at tibco.com
Fri Oct 16 00:10:31 CEST 2015


C can tell when it hits the end of input.  Reading the lines with
readLines and passing them to scan() does not help - it is the
same as having scan read the original file.
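
To illustrate the point with the sample text from the original question (a small sketch; the variable names txt, physical and all_fields are only for this illustration): readLines() splits on every newline, including those inside quotes, so the physical lines do not line up with the logical ones, and scanning them again just gives back the flat vector of fields.

```r
# Sample text from the original question: 3 logical "lines" of fields,
# but 6 physical lines, because quoted fields contain newlines.
txt <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
tfile <- tempfile()
cat(txt, file = tfile)

# readLines() knows nothing about quotes, so it breaks inside quoted fields.
physical <- readLines(tfile)
length(physical)  # 6 physical lines

# Handing those lines to scan() glues them back together: all five fields
# come back, but with no record of which logical line each belonged to.
all_fields <- scan(text = physical, what = "", quiet = TRUE)
all_fields

unlink(tfile)
```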

My problem is that the file (or other connection) has a variable number
of fields on each "line", and perhaps no fields on some lines.  Fields
enclosed in quotes may include newline characters.  I want to read this
file into a list of character vectors, the n'th element of the list being
the fields on the n'th "line" of the file.

Repeating scan(connection, nlines=1, what="") does everything right
except for telling me when it has read everything the connection
has to offer.  scan(connection, what="") manages to figure out where
the end of the file is, but does not tell me the line number associated
with each character string.
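
One possible workaround, offered only as a sketch and not something confirmed in this thread: peek at the next physical line with readLines(n=1), stop if nothing comes back, and otherwise push the line back with pushBack() before letting scan() read one logical line. This assumes a text-mode connection that supports pushback and that scan() reads pushed-back input first. The helper name read_logical_lines is hypothetical.

```r
# Hypothetical helper (not from the thread): read a text-mode connection into
# a list holding one character vector of fields per logical line.
read_logical_lines <- function(con) {
  out <- list()
  repeat {
    # Peek at the next physical line; character(0) means nothing is left.
    peek <- readLines(con, n = 1L, warn = FALSE)
    if (length(peek) == 0L) break
    # Put the peeked line back so scan() sees the input unchanged.
    pushBack(peek, con)
    # scan() consumes one logical line, following newlines inside quotes.
    # An empty line yields character(0), which is stored as a list element.
    out[[length(out) + 1L]] <- scan(con, what = "", nlines = 1L, quiet = TRUE)
  }
  out
}

# Usage with the sample text from the original question.
txt <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
tfile <- tempfile()
cat(txt, file = tfile)
tcon <- file(tfile, "r")
fields <- read_logical_lines(tcon)
close(tcon)
unlink(tfile)
str(fields)
```

The peek distinguishes an empty line (readLines returns "") from end of input (readLines returns character(0)), which scan(nlines=1) alone does not.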

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Thu, Oct 15, 2015 at 2:57 PM, Jeff Newmiller
<jdnewmil at dcn.davis.ca.us> wrote:
> This is a problem in C as well... and the solution is to read the lines yourself and then give those lines to scan.
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> On October 15, 2015 1:16:58 PM PDT, William Dunlap <wdunlap at tibco.com> wrote:
>>I would like to read a connection line by line with scan but
>>don't know how to tell when to quit trying.  Is there any
>>way that you can ask the connection object if it is at the end?
>>
>>E.g.,
>>
>>t <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
>>tfile <- tempfile()
>>cat(t, file=tfile)
>>tcon <- file(tfile, "r") # or tcon <- textConnection(t)
>>scan(tcon, what="", nlines=1)
>>#Read 2 items
>>#[1] "A"               "Two line\nentry"
>>scan(tcon, what="", nlines=1)  # empty line
>>#Read 0 items
>>#character(0)
>>scan(tcon, what="", nlines=1)
>>#Read 3 items
>>#[1] "Three\nline\nentry" "D"                  "E"
>>scan(tcon, what="", nlines=1) # end of file
>>#Read 0 items
>>#character(0)
>>scan(tcon, what="", nlines=1) # end of file
>>#Read 0 items
>>#character(0)
>>
>>I am reading virtual line by virtual line because the lines
>>may have different numbers of fields.
>>
>>Bill Dunlap
>>TIBCO Software
>>wdunlap tibco.com
>>
>


