R-alpha: Re^3: data file names

Martin Maechler <maechler@stat.math.ethz.ch>
Wed, 3 Dec 1997 09:00:11 +0100

>>>>> "KH" == Kurt Hornik <Kurt.Hornik@ci.tuwien.ac.at> writes:

>>>>> Robert Gentleman writes:
    >>> In preparing the next Windows
    >>> release I want to make opening up system data files (and their
    >>> documentation) more transparent. I would really like to adopt
    >>> the convention that data files use the suffix .rdf (.dat seems
    >>> like it's taken). This will make it easier to get the builtin
    >>> Windows file browsers to work on it and save me some
    >>> considerable work trying to figure out what is and isn't a data
    >>> file.

    MM> But why can't it just be `.R' (-> `.r' in the ...OS)?
    MM> At least currently, these are simply files with valid R code.  If
    MM> they ended in `.R', ESS (Emacs Speaks Statistics) would
    MM> automatically be put in the proper mode.  Of course, we could also
    MM> use a new ending, but I think this would only make sense if the
    MM> data files internally used a different format, e.g., data.dump
    MM> / data.restore, or one which works with read.table.

    >> I agree with Martin. I would be happy with .R initially and later we
    >> could add other suffixes if we have different data formats. I was
    >> just going to slurp them in through source.

    KH> Shall we try to be consistent about other endings, too?  E.g.,

Yes, why not agree on this now, even before implementation!

    KH> rdf	for R data file/format (data.dump/data.restore)

data.dump/data.restore is hopefully going to be as S-plus compatible as
possible (that's why most of us would want it).
Hence, I think the `R' in `rdf' is really too much there.

Would just

	dmp	still be `free' (in Windoze I mean) ?

    KH> table	for tables

For portability, we'd probably rather need 3-letter extensions...
	tab	for tables (to be readable by read.table ?!)
		Note, we have to agree on whether read.table will be called
		with  (...., header = TRUE)  or not.

		Here, mostly for teaching purposes, we have (almost) all
		our data files in the format

			nam.1  col.2  var.3 ...  var.<p>
			<x11>   <x12>  <x13> ...  <x1p>
			<x21>   <x22>  <x23> ...  <x2p>
			  ..      ..    ..         ..
			  ..      ..    ..         ..
			  ..      ..    ..         ..

		I.e. we don't have `rownames' most of the time and don't
		want to type '1' '2' ... 'n' in front of each row.
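
A small sketch of reading a file in the layout above, assuming
whitespace-separated columns.  The file name `example.tab' and the
numbers in it are hypothetical.

```r
# Hypothetical file contents in the layout described above:
# a header line of variable names, then one row of values per line.
tab.file <- file.path(tempdir(), "example.tab")
writeLines(c("nam.1  col.2  var.3",
             "1.5    10     0.1",
             "2.5    20     0.2",
             "3.5    30     0.3"),
           tab.file)

# header = TRUE takes the first line as column names; row names
# default to 1, 2, ..., n, so none need to be typed into the file.
d <- read.table(tab.file, header = TRUE)

print(names(d))
print(rownames(d))
```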


	Maybe we should even have

	tab	for the above		 ->  read.table(<file>, header = TRUE)
	tbr	for tables with rownames ->  read.table(<file>)

			col.1  col.2  var.3 ...  var.<p>
		r1	<x11>   <x12>  <x13> ...  <x1p>
		r2	<x21>   <x22>  <x23> ...  <x2p>
		..	  ..      ..    ..         ..
		..	  ..      ..    ..         ..
		..	  ..      ..    ..         ..
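
As a rough sketch of how the two suffixes could map to read.table
calls: the helper `read.by.suffix' and the file `example.tbr' are
hypothetical and not part of R.  It relies on read.table's documented
behaviour that, when `header' is not given and the first line has one
field fewer than the rest, that line is taken as the header and the
first column as row names.

```r
# Hypothetical `tbr' file: the header line has one field fewer than
# the data lines, so the first column holds the row names.
tbr.file <- file.path(tempdir(), "example.tbr")
writeLines(c("col.1  col.2  var.3",
             "r1  1.5  10  0.1",
             "r2  2.5  20  0.2"),
           tbr.file)

# Hypothetical helper (not part of R): pick the read.table() call
# from the file suffix, following the tab/tbr proposal above.
read.by.suffix <- function(file) {
    suffix <- sub(".*\\.", "", basename(file))
    switch(suffix,
           tab = read.table(file, header = TRUE),
           tbr = read.table(file),
           stop("unknown data file suffix: ", suffix))
}

d <- read.by.suffix(tbr.file)
print(rownames(d))
print(names(d))
```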
