[Rd] A couple of issues with colClasses/setAs

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Sep 8 12:09:07 CEST 2004

Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:

> From ?read.table (this is about read.table, despite the subject line, I 
> believe?)
> colClasses: character.  A vector of classes to be assumed for the columns.
> "NULL" is not a class in my book (and certainly not one a column can
> have).  So no wonder it does not work, and it is not a bug not to work in
> undocumented cases.

> class(NULL)
[1] "NULL"

But you're right, I sort of assumed that NULL would work, and guessed
the syntax.

> We can look into making it work, but once you start skipping columns I 
> think you should be using scan().  (I also suspect scan did not accept 
> NULL when this was implemented.)

One-stop shopping would be a good target here, I think; it would be
good if we could teach read.table to skip columns while retaining all
the other niceties.
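
For comparison, here is a minimal sketch of the scan() route suggested above. The sample text and variable names are made up for illustration, and the text= argument assumes a reasonably current R. A NULL component in "what" makes scan() skip that field, but the header handling and data-frame assembly that read.table provides are left to the caller.

```r
# Hypothetical three-field input; the middle field is the one to skip.
txt <- "1 a 2.5\n3 b 4.5"

# A NULL entry in 'what' tells scan() to skip the corresponding field.
fields <- scan(text = txt, what = list(0, NULL, 0), quiet = TRUE)

# The caller has to drop the skipped slot and build the data frame by hand.
kept <- fields[!vapply(fields, is.null, logical(1))]
names(kept) <- c("x", "y")
df <- as.data.frame(kept)
print(df)
```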

The reason that it appeared to work sometimes is in this loop

    for (i in 1:cols) {
        # skip columns flagged as known
        if (known[i])
            next
        data[[i]] <- if (!is.na(colClasses[i]))
            as(data[[i]], colClasses[i])
        else type.convert(data[[i]], as.is = as.is[i], dec = dec,
            na.strings = character(0))
    }

which sets data[[i]] <- NULL when colClasses[i] is "NULL". Had this
been intentional, it would of course be horribly wrong, since it makes
the index of all subsequent columns decrease by 1...
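
The index shift is easy to reproduce outside read.table. The following is only a sketch with a made-up three-element list, not the read.table internals: assigning NULL with [[<- deletes the element, whereas assigning list(NULL) with [<- keeps the slot.

```r
# Made-up stand-in for the list of columns.
data <- list(a = 1:2, b = c("x", "y"), c = c(0.5, 1.5))

# Assigning NULL via [[<- removes element 2, so the old element 3
# now sits at position 2.
shifted <- data
shifted[[2]] <- NULL
print(names(shifted))

# Assigning list(NULL) via [<- keeps the slot, so positions are stable.
stable <- data
stable[2] <- list(NULL)
print(names(stable))
```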

To make it actually work, we should probably fix up the "what" that is
being passed to scan a bit further upstream.
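
For readers trying this today: the ?read.table page in current R releases lists "NULL" among the possible colClasses values, meaning the column is skipped. A small sketch with made-up data, assuming such a release:

```r
# Made-up input with three columns; column b is to be skipped.
txt <- "a b c\n1 x 2.5\n2 y 3.5"

# "NULL" (the string, not the object) marks a column to be dropped.
df <- read.table(text = txt, header = TRUE,
                 colClasses = c("integer", "NULL", "numeric"))
print(names(df))
```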

> Might be a good idea to teach colClasses about "factor".

That's what I thought. Other ideas would be to predefine some standard
date classes (it's a bit annoying that there's no way to give
auxiliary information like formats), and maybe to allow a second
header line containing class names.
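
The format limitation can be worked around with the setAs mechanism from the subject line, by defining a class whose coercion from character carries the format. This is only a sketch: the class name "dmyDate", the format string and the sample data are made up for illustration, and the "factor" entry assumes a version of R whose ?read.table lists it as an allowed colClasses value.

```r
library(methods)

# Hypothetical class whose only purpose is to carry a date format.
setClass("dmyDate")
setAs("character", "dmyDate",
      function(from) as.Date(from, format = "%d/%m/%Y"))

# Made-up input: an id, a grouping column and a day/month/year date.
txt <- "id group when\n1 a 08/09/2004\n2 b 09/09/2004"
df <- read.table(text = txt, header = TRUE,
                 colClasses = c("integer", "factor", "dmyDate"))
str(df)
```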

   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
