[Rd] row.names(data.frame(matrixWithDimnames)) depends on first rowname being "" or not. (PR#13230)

wdunlap at tibco.com wdunlap at tibco.com
Thu Oct 30 05:00:07 CET 2008


Full_Name: Bill Dunlap
Version: R version 2.9.0 Under development (unstable) (2008-10-29 r46795) 
OS: Linux
Submission from: (NULL) (76.28.245.14)


When data.frame() is given a matrix with rownames, the type of the output
row names depends on whether the first element of the input row names is "" or
not.  The remaining elements of the input row names have no effect.  E.g.,

> data.frame(matrix(1:6, nrow=3, ncol=2, dimnames=list(c("","Row 2","Row 3"),
paste("Col",1:2))))
  Col.1 Col.2
1     1     4
2     2     5
3     3     6
> data.frame(matrix(1:6, nrow=3, ncol=2, dimnames=list(c("Row 1","","Row 3"),
paste("Col",1:2))))
      Col.1 Col.2
Row 1     1     4
          2     5
Row 3     3     6

I noticed this when converting a table of word counts (by speaker) into a
data.frame and the word "" came first in the collating sequence so the words did
not become the row names of the output.  If the "" was not first in the table
then the row names of the input were carried into the output.
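
Until the check is changed, one way to sidestep the dependence on the first
row name is to hand the row names over explicitly, or to convert with
as.data.frame() so that data.frame() never has to infer them. The following is
a minimal sketch, assuming the matrix from the first example above; the
variable names m, df1 and df2 are only illustrative.

```r
# Matrix whose first row name is the empty string, as in the report.
m <- matrix(1:6, nrow = 3, ncol = 2,
            dimnames = list(c("", "Row 2", "Row 3"), paste("Col", 1:2)))

# Pass the row names explicitly, so data.frame() does not infer them
# from the converted matrix.
df1 <- data.frame(m, row.names = rownames(m))

# as.data.frame() dispatches to as.data.frame.matrix(), which is not
# where the first-element check lives.
df2 <- as.data.frame(m)

rownames(df1)
rownames(df2)
```

Both calls should keep "" as the first row name. Note that as.data.frame()
does not run the column names through make.names(), so df2 keeps "Col 1"
rather than "Col.1".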

I haven't had the time yet to make a fix for this, but the distinction between
row.names[1] == "" and row.names[1] != "" comes from code in data.frame()
itself (not as.data.frame.matrix):

        if (missing(row.names) && nrows[i] > 0L) {
            rowsi <- attr(xi, "row.names")
            if (!(rowsi[[1L]] %in% ""))
                row.names <- data.row.names(row.names, rowsi, i)
        }

Why is that check there?
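
To make the effect of the condition concrete, it can be isolated from
data.frame() in a small sketch. The helper names keeps_names_first and
keeps_names_any are hypothetical and not part of R; the second helper is only
one possible alternative (keep the names if any of them is non-empty), not a
claim about how the check should or will be fixed.

```r
# Hypothetical helpers that isolate the decision; they are not part of R.

# Mirrors the quoted condition: only the first row name is inspected.
keeps_names_first <- function(rowsi) !(rowsi[[1L]] %in% "")

# One possible alternative: keep the names if any of them is non-empty.
keeps_names_any <- function(rowsi) any(nzchar(rowsi))

a <- c("", "Row 2", "Row 3")   # "" first, as in the first example
b <- c("Row 1", "", "Row 3")   # "" second, as in the second example

keeps_names_first(a)  # FALSE: the row names are dropped
keeps_names_first(b)  # TRUE:  the row names are kept
keeps_names_any(a)    # TRUE
keeps_names_any(b)    # TRUE
```

Under the alternative, both inputs would be treated the same way, and only a
vector consisting entirely of "" would fall back to automatic row names.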

Bill Dunlap
TIBCO Spotfire
wdunlap tibco.com


