data frames with non-unique row.names (PR#98)

Martin Maechler
Thu, 21 Jan 1999 18:15:04 +0100

My conclusions from the (two!) reactions to my posting of yesterday:

    MM> In R and S, the general idea is that data.frames must have unique
    MM> row.names (aka dimnames(.)[[1]]).

    MM> Several observations / problems (in R *and* S !).

    MM> 	[Example code at the end]

    MM> 1) Both in S and R,

    MM> 	  data.frame(..)

    MM>   (and e.g., also cbind(<data.frame>, ..)  which dispatches to
    MM> data.frame()) silently drops the whole row.names and replaces it by
    MM> "1" "2" ...  if the names would be non-unique.

    MM>  PROPOSITION 1: I have the feeling I'd want to get a warning in
    MM> that case.  However, you may prove me wrong...

I'll introduce this warning in the current sources
[if I still don't hear reasons against it].
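
For illustration, a minimal sketch of observation 1) as it can be tried in a
present-day R session. This is an assumption about current behaviour; the
1999 sources discussed here may differ in detail. The names v and d are made
up for the example.

```r
# Hypothetical example data: a named vector whose names are not unique.
v <- c(a = 1, a = 2, b = 3)

# Build a data frame from it; row.names would normally come from names(v).
d <- data.frame(x = v)

# Inspect the result: the duplicated names are not kept as row.names,
# automatic ones ("1", "2", "3") are used instead.
row.names(d)
```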

    MM> 2) Now, in S (but not in R), the "row.names<-" function gives an
    MM> error if you try to assign non-unique row.names.

    MM>    This is as desired (and R should do the same).

    MM>    (== BUG REPORT for R )

This is already in tomorrow's R-release patches.
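
A small sketch of what the check in 2) looks like from the user's side,
assuming a present-day R in which "row.names<-" rejects duplicates. The exact
error text is not relied upon, and dat and res are hypothetical names.

```r
dat <- data.frame(x = 1:3)

# Try to assign non-unique row.names; record whether an error was raised
# instead of letting it stop the script.
res <- suppressWarnings(tryCatch(
  {
    row.names(dat) <- c("a", "a", "b")
    "assigned"
  },
  error = function(e) "error"
))

res              # "error" if the duplicates were rejected
row.names(dat)   # unchanged when the assignment failed
```
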
    MM> 3) However, I can still (both in S-plus 3.4 & 5.0r2) do attr(dat,
    MM> "row.names") <- <nonunique character>
    MM>    and get a resulting data.frame dat with non-unique row.names.

    MM>  PROPOSITION 2: I think I want to make sure that a(the same?) error
    MM> message as in "2)" is generated in this case.

No, we won't do this.
Apparently, S-plus 5.0 now even provides an explicit argument to
data.frame() that allows non-unique row.names
(in order to bootstrap() or quickly read.table() large data.frames).
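
The back door described in 3) can be sketched as follows, assuming a
present-day R in which setting the attribute directly does not check for
uniqueness. The name dat is made up for the example.

```r
dat <- data.frame(x = 1:3)

# Bypass "row.names<-" and set the attribute directly.
attr(dat, "row.names") <- c("a", "a", "b")

# Still a data frame, but its row.names now contain a duplicate.
is.data.frame(dat)
anyDuplicated(attr(dat, "row.names")) > 0
```

Code that relies on unique row.names can use anyDuplicated() as above to
detect such objects before indexing by name.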

Martin Maechler <>
Seminar fuer Statistik, ETH-Zentrum SOL G1;	Sonneggstr.33
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1086			<><