[Rd] row.names(data.frame(matrixWithDimnames)) depends on first (PR#13244)

ripley at stats.ox.ac.uk
Mon Nov 3 13:45:07 CET 2008


On Thu, 30 Oct 2008, wdunlap at tibco.com wrote:

> Full_Name: Bill Dunlap
> Version: R version 2.9.0 Under development (unstable) (2008-10-29 r46795)
> OS: Linux
> Submission from: (NULL) (76.28.245.14)
>
>
> When data.frame() is given a matrix with rownames, then the type of the output
> row names depends on whether the first element of the input row names is "" or
> not.   The other elements of the input row names don't affect things.  E.g.,
>
>> data.frame(matrix(1:6, nrow=3, ncol=2, dimnames=list(c("","Row 2","Row 3"),
> paste("Col",1:2))))
>  Col.1 Col.2
> 1     1     4
> 2     2     5
> 3     3     6
>> data.frame(matrix(1:6, nrow=3, ncol=2, dimnames=list(c("Row 1","","Row 3"),
> paste("Col",1:2))))
>      Col.1 Col.2
> Row 1     1     4
>          2     5
> Row 3     3     6
>
> I noticed this when converting a table of word counts (by speaker) into a
> data.frame and the word "" came first in the collating sequence so the words did
> not become the row names of the output.  If the "" was not first in the table
> then the row names of the input were carried into the output.
>
> I haven't had the time yet to make a fix for this, but the distinction between
> row.names[1] != or == "" comes from code in data.frame() itself (not
> as.data.frame.matrix):
>
>     if (missing(row.names) && nrows[i] > 0L) {
>         rowsi <- attr(xi, "row.names")
>         if (!(rowsi[[1L]] %in% ""))
>             row.names <- data.row.names(row.names, rowsi, i)
>     }
>
> Why is that check there?

Well, there is a comment in the sources,

    if(missing(row.names) && nrows[i] > 0L) {
        rowsi <- attr(xi, "row.names")
        ## old way to mark optional names
        if(!(rowsi[[1L]] %in% ""))
            row.names <- data.row.names(row.names, rowsi, i)
    }

which was last changed in Dec 2006.  However, the behaviour was much 
older.

It seems we can now change the check to test whether any row name is
non-empty, rather than looking only at the first one.
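
As a rough illustration of the difference (a sketch only, not the actual
patch): the two helper functions below are hypothetical and not part of R.
They isolate the existing condition and one possible "any non-empty row
name" condition, assuming the latter is written with any(nzchar(.)), and
apply both to the row names from the report plus an all-empty case.

```r
# Hypothetical helpers, not part of R: each returns TRUE when the input
# row names would be carried into the data frame under that condition.
uses_row_names_old <- function(rowsi) !(rowsi[[1L]] %in% "")  # current test
uses_row_names_new <- function(rowsi) any(nzchar(rowsi))      # assumed new test

cases <- list(
    c("", "Row 2", "Row 3"),   # first name empty (first example in the report)
    c("Row 1", "", "Row 3"),   # second name empty (second example)
    c("", "", "")              # all names empty
)

old <- vapply(cases, uses_row_names_old, logical(1))
new <- vapply(cases, uses_row_names_new, logical(1))

# One row per case; the two columns differ only for the first case.
print(cbind(old = old, new = new))
```

Under the assumed condition only the first case changes: its row names
would now be kept, while all-empty row names are still ignored.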

>
> Bill Dunlap
> TIBCO Spotfire
> wdunlap tibco.com
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


