[R] Can scan() detect end-of-file?

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Fri Oct 16 00:56:55 CEST 2015


I don't know of any OS-independent function in C that behaves the way you
describe. To get that behavior in C, I would write the equivalent of the
function below myself.

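readListOfVectors <- function( input ) {
  # input may be a filename or a connection object
  lines <- readLines( input )
  # drop a trailing empty line, if any (the length check guards empty input)
  if ( 0 < length( lines ) && "" == lines[ length( lines ) ] ) {
    lines <- lines[ -length( lines ) ]
  }
  # scan each line separately so that each one yields its own numeric vector
  result <- lapply( lines
                  , function( lin ) {
                      lc <- textConnection( lin )
                      res <- scan( lc, quiet=TRUE )
                      close( lc )
                      res
                    }
                  )
  result
}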

# test
txt <- (
"1 2 3 4
1 4 5
2 4 6 8 9
")

tc <- textConnection(txt)
# can give it a filename or a connection object
readListOfVectors( tc )
close(tc)
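
The function above splits on physical lines, so it does not cover the case
in the quoted message below, where a quoted field may contain newlines. One
possible workaround, offered as a rough sketch rather than a tested
solution: readLines( con, n=1 ) returns character(0) at end of input but ""
for an empty line, so it can act as the end-of-file test, and pushBack() can
hand the peeked line back to the connection before scan() parses it. The
name readListOfFields is made up for illustration, and the sketch assumes
that scan() reads pushed-back text on a text-mode connection before
continuing with the rest of the input.

```r
# Hypothetical helper (not from the thread): read "virtual lines" of
# whitespace-separated fields, where a quoted field may span newlines.
readListOfFields <- function( con ) {
  result <- list()
  repeat {
    # character(0) means end of input; "" would mean an empty line
    peek <- readLines( con, n=1 )
    if ( 0 == length( peek ) ) break
    # put the peeked line back so scan() sees it, quotes and all
    pushBack( peek, con )
    # scan() consumes one virtual line, following quotes across newlines
    result[[ length( result ) + 1 ]] <- scan( con, what="", nlines=1, quiet=TRUE )
  }
  result
}

# test, using the example text from the quoted message
txt2 <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
tfile <- tempfile()
cat( txt2, file=tfile )
tcon <- file( tfile, "r" )
res <- readListOfFields( tcon )
close( tcon )
unlink( tfile )
str( res )
```

The connection has to be opened by the caller, because the position must
persist between the readLines() and scan() calls.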

On Thu, 15 Oct 2015, William Dunlap wrote:

> C can tell when it hits the end of input.  Reading the lines with
> readLines and passing them to scan() does not help - it is the
> same as having scan read the original file.
>
> My problem is that the file (or other connection) has a variable number
> of fields on each "line", and perhaps no fields on some lines.  Fields
> enclosed in quotes may include newline characters.  I want to read this
> file into a list of character vectors, the n'th element of the list being
> the fields on the n'th "line" of the file.
>
> repeating scan(connection, nlines=1, what="") does everything right
> except for telling me when it has read everything the connection
> has to offer.  scan(connection, what="") manages to figure out where
> the end of the file is, but does not tell me the line number associated
> with each character string.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Thu, Oct 15, 2015 at 2:57 PM, Jeff Newmiller
> <jdnewmil at dcn.davis.ca.us> wrote:
>> This is a problem in C as well... and the solution is to read the lines yourself and then give those lines to scan.
>> ---------------------------------------------------------------------------
>> Jeff Newmiller                        The     .....       .....  Go Live...
>> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>>                                       Live:   OO#.. Dead: OO#..  Playing
>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>> ---------------------------------------------------------------------------
>> Sent from my phone. Please excuse my brevity.
>>
>> On October 15, 2015 1:16:58 PM PDT, William Dunlap <wdunlap at tibco.com> wrote:
>>> I would like to read a connection line by line with scan but
>>> don't know how to tell when to quit trying.  Is there any
>>> way that you can ask the connection object if it is at the end?
>>>
>>> E.g.,
>>>
>>> t <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
>>> tfile <- tempfile()
>>> cat(t, file=tfile)
>>> tcon <- file(tfile, "r") # or tcon <- textConnection(t)
>>> scan(tcon, what="", nlines=1)
>>> #Read 2 items
>>> #[1] "A"               "Two line\nentry"
>>> scan(tcon, what="", nlines=1)  # empty line
>>> #Read 0 items
>>> #character(0)
>>> scan(tcon, what="", nlines=1)
>>> #Read 3 items
>>> #[1] "Three\nline\nentry" "D"                  "E"
>>> scan(tcon, what="", nlines=1) # end of file
>>> #Read 0 items
>>> #character(0)
>>> scan(tcon, what="", nlines=1) # end of file
>>> #Read 0 items
>>> #character(0)
>>>
>>> I am reading virtual line by virtual line because the lines
>>> may have different numbers of fields.
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>>
>>
>
