[Rd] feature request: comment character in read.table?

Ben Bolker bolker@zoo.ufl.edu
Thu, 13 Sep 2001 12:58:58 -0400 (EDT)

  It was pretty straightforward.
  I haven't tested it very extensively, and not on a Mac at all.  The only
subtlety is skipping commented lines at the beginning of the file because
read.table assumes that the first line (not the first non-blank line) is
the header line if header=TRUE.

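read.table.c <- function(file,comment="#",debug=FALSE,...) {
  infile <- file(file,"r")
  tmpfile <- file()  ## anonymous temporary file connection
  cchar <- comment   ## use the argument rather than a hard-coded "#"
  while (length(cline <- readLines(infile,1))>0) {
    ## keep only the text before the first comment character
    s <- strsplit(cline,cchar,fixed=TRUE)[[1]][1]
    if (!is.na(s) && nchar(s)>0)  { ## skip blank lines to not screw up header
      if (debug) cat(s,"\n")
      writeLines(s,tmpfile)
    }
  }
  r <- read.table(tmpfile,...)
  close(infile)
  close(tmpfile)
  r
}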

On Wed, 12 Sep 2001, Prof Brian D Ripley wrote:

> On Wed, 12 Sep 2001, Ben Bolker wrote:
> >
> >   How difficult would it be (I could try myself if someone thought it
> > would be straightforward) to change read.table to allow a comment
> > character such as # or %?  My thought would be that anything on a line
> > following a comment character would be ignored (so that the combination of
> > blank.lines.skip=TRUE and a comment at the beginning of the line would
> > lead to a line being skipped completely).
>
> It's hard in read.table, especially given the changes in R-devel to make it
> more flexible.  The place to do this seems to me to be the internals of
> scan.  They are far from transparent, though.
>
> >  I'm always encouraging my students to comment their data sets, and it
> > feels lame to tell them they have to count the number of initial lines in
> > the data set in order to set the "skip" parameter appropriately.  (The
> > comment character would also allow comments about a particular data
> > point.)
> >
> >   I know I could hack this (a) with sed in Unix [but my students using
> > Windows are likely to have trouble] (b) within R, by processing the file
> > and creating a temporary file with comments deleted.  (b) is probably what
> > I'll do as a temporary fix, but this seems to be a reasonable piece of
> > functionality for R to have ...
>
> (b) seems easy to me. Use either a file() connection or an output text
> connection.  (I don't know whether file() with no arguments works on the
> Mac, for example, though.)
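
  For comparison, here is a rough sketch of a text-connection variant of
(b).  It is untested beyond the small example shown and makes some
assumptions: the name read.table.tc is made up for illustration, the whole
file is read into memory at once, and the cleaned lines are passed to
read.table through an input text connection (rather than the output text
connection suggested above).  The comment character is matched literally,
not as a regular expression.

```r
## Hypothetical helper (not part of R): strip comments in memory, then
## hand the cleaned lines to read.table through a text connection.
read.table.tc <- function(file, comment = "#", ...) {
  lines <- readLines(file)
  ## position of the first comment character on each line (-1 if none)
  pos <- regexpr(comment, lines, fixed = TRUE)
  ## keep only the text before the comment character
  lines <- ifelse(pos > 0, substr(lines, 1, pos - 1), lines)
  ## drop lines that are now empty or whitespace-only, so that a leading
  ## comment block cannot be mistaken for the header line
  lines <- lines[nzchar(gsub("[ \t]", "", lines))]
  con <- textConnection(lines)
  on.exit(close(con))
  read.table(con, ...)
}

## usage: a small commented data file written to a temporary location
tmp <- tempfile()
writeLines(c("# measured in the lab",
             "x y",
             "1 2   # suspicious point",
             "",
             "3 4"), tmp)
d <- read.table.tc(tmp, header = TRUE)
unlink(tmp)
```

  Here d should have the two columns x and y and two rows.  Reading the
whole file at once avoids the line-by-line loop, at the cost of holding
the file in memory.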

318 Carr Hall                                bolker@zoo.ufl.edu
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704
