An idea for something better than read.table

Kurt Hornik
Thu, 11 Feb 1999 19:10:27 +0100 (CET)

>>>>> Peter Dalgaard BSA writes:

> I was recently converting some datasets for use in an R package and it
> occurred to me that there really is no "neat" way to input a data
> frame if it is to contain factor variables. 

> One can use dput()/source or dump() after massaging data into the
> right format, of course, but there isn't really anything which allows
> you to store the input instructions with the data beyond the simple
> header=T type format. 
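
For illustration, here is a minimal sketch of the dput() route mentioned
above, using a made-up toy data frame (the column names and values are
assumptions, not from the datasets being discussed). It shows that the
factor levels survive, but also that the file is deparsed R code rather
than a plain data table:

```r
# Hypothetical toy data frame with one factor column.
df <- data.frame(
  Item = factor(c("A", "C", "B"), levels = c("A", "B", "C", "D")),
  Size = c(1.5, 2.0, 3.25)
)

# dput() writes a deparsed representation; dget() reads it back.
path <- tempfile(fileext = ".R")
dput(df, file = path)
df2 <- dget(path)

# The factor levels, including the unused level "D", come back intact.
levels(df2$Item)
```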

> So I thought of ways to enhance the header. The best idea I've been
> able to come up with thus far is to 

> (a) Write a function - basically an extension of scan() - which allows
>     you to specify the column data type in more detail. Let's call it
>     data.file() for now. It would pretty much have to deparse all of
>     its arguments and interpret things in slightly unusual ways, but R
>     can do that, and some functions (notably help() and data())
>     already play this kind of game with the parser...

> (b) Have a function, say read(), which parses the 1st expression in a
>     file and executes it *with the remainder of the file as the
>     argument*. (Currently, this is impossible, but it would be possible
>     if one just kept track of the line number while parsing. parse()
>     could stick it on as an attribute of the parsed expression list if
>     asked to do so.)

> This would make a file format something like the following possible.

> [There's another loose idea in there involving a control item to handle
> separators, na.strings, etc. - the intention being that read() plugs
> in the file= and skip= arguments for the actual call.]
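
A rough sketch of how idea (b) could be approximated without any change
to parse(): read the file line by line until the leading lines form one
complete expression, then call that expression with file= and skip=
plugged in. The function body and the sample file layout below are
assumptions made for illustration, not the format proposed in the
original message, and the header here is an ordinary read.table() call
rather than the proposed data.file().

```r
# Hypothetical read(): evaluate the first expression in a file, passing it
# the file name and the number of header lines to skip.
read <- function(path) {
  lines <- readLines(path)
  expr <- NULL
  # Grow the header one line at a time until it parses as a complete call.
  for (n in seq_along(lines)) {
    expr <- tryCatch(parse(text = lines[seq_len(n)])[[1]],
                     error = function(e) NULL)
    if (is.call(expr)) break
  }
  if (!is.call(expr)) stop("no complete call found at the top of the file")
  # Append file= and skip= to the call, then evaluate it in the caller's frame.
  call <- as.call(c(as.list(expr), list(file = path, skip = n)))
  eval(call, parent.frame())
}

# Sample file: a two-line header expression followed by the data.
path <- tempfile()
writeLines(c(
  'read.table(header = TRUE,',
  '           colClasses = c("factor", "numeric"))',
  'Item Size',
  'A 1.5',
  'C 2.0',
  'B 3.25'
), path)

dat <- read(path)
str(dat)
```

One limitation of this approach: a header whose first line already
happens to be a complete call would be cut off too early, which is
exactly the kind of problem that line tracking in parse() would avoid.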

> Would this be an approach worth pursuing?

I think so.  However, why can't we extend scan() accordingly?


       what = list(Item = factor(levels=1:4,labels=c("A","B","C","D")),
                   Size = numeric(),
                   Year = factor(levels=1980:1985)))
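
To make this concrete, here is a minimal sketch of a wrapper that accepts
factor templates in what=. As far as I know, scan() only looks at the
basic type of each component of what, so the wrapper reads factor columns
as character and converts them afterwards. The name scan_typed() and the
sample data are hypothetical, and for simplicity the templates use
levels= only (no labels=).

```r
# Hypothetical wrapper around scan() that understands factor templates.
scan_typed <- function(file, what, ...) {
  is_fac <- vapply(what, is.factor, logical(1))
  # Replace factor templates by character() for the actual scan() call.
  raw_what <- what
  raw_what[is_fac] <- list(character())
  out <- scan(file, what = raw_what, quiet = TRUE, ...)
  # Re-encode the character columns using the levels of each template.
  for (nm in names(what)[is_fac]) {
    out[[nm]] <- factor(out[[nm]], levels = levels(what[[nm]]))
  }
  as.data.frame(out)
}

# Made-up sample data: one record per line, fields separated by blanks.
path <- tempfile()
writeLines(c("A 1.5 1981", "C 2.0 1985", "B 3.25 1980"), path)

dat <- scan_typed(path, what = list(
  Item = factor(levels = c("A", "B", "C", "D")),
  Size = numeric(),
  Year = factor(levels = 1980:1985)))
str(dat)
```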

