[Rd] [R] rownames, colnames, and date and time

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Mar 29 22:26:10 CEST 2006


Looking at the code, it occurs to me that there is another case you have 
not considered, namely dimnames().

rownames<- and colnames<- are just wrappers for dimnames<-, so consistency 
does mean that all three should behave the same.
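
As a small illustration of the wrapper relationship (a sketch only; the 
object names m1 and m2 are made up for the example), setting row names on a 
matrix through rownames<- should leave the same object as assigning the 
whole dimnames list directly:

```r
# Two copies of the same 2 x 2 matrix, initially without dimnames
m1 <- matrix(1:4, nrow = 2, ncol = 2)
m2 <- m1

# Route 1: the convenience wrapper
rownames(m1) <- c("a", "b")

# Route 2: assign the whole dimnames list directly
dimnames(m2) <- list(c("a", "b"), NULL)

# Both routes should leave the same attributes on the matrix
same <- identical(m1, m2)
print(same)
```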

For arrays (including matrices), dimnames<- is primitive.  It coerces 
factors to character, and the C code carries the comment

     /* if (isObject(val1)) dispatch on as.character.foo, but we don't
        have the context at this point to do so */

so someone considered this before now.
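
A short sketch of what this means at the R level, assuming a version of R 
in which the primitive behaves as described above (the variables f and d 
are invented for the example): a factor goes in by its labels, whereas 
another classed value, here a Date, is coerced without its as.character 
method being called.

```r
m <- matrix(1:4, nrow = 2, ncol = 2)

# A factor is converted via its labels, not its integer codes
f <- factor(c("low", "high"))
rownames(m) <- f
factor_names <- rownames(m)
print(factor_names)

# A classed non-factor value (here a Date) is coerced without calling
# its as.character method, so the underlying day counts are used
d <- as.Date(c("2006-03-29", "2006-03-30"))
colnames(m) <- d
date_names <- colnames(m)
print(date_names)
```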

For data frames, dimnames<-.data.frame is used.  That calls row.names<- 
and names<-, and the first has a data.frame method.  Only the row.names<- 
method is documented to coerce its value to character, and I think it _is_ 
all quite consistent.  The basic rule is that all these functions coerce 
for data frames, and none do for arrays (apart from the factor case noted 
above).
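
The contrast can be seen with the objects from the original report.  This 
is only a sketch: the tz = "UTC" argument is an addition of mine so that the 
result does not depend on the local time zone, and df_names and mat_names 
are hypothetical helper names.

```r
mymat <- matrix(1:4, nrow = 2, ncol = 2)
mydf <- data.frame(mymat)
# tz = "UTC" is an assumption, used to avoid dependence on the local time zone
mydates <- as.POSIXct(c("2001-01-24", "2005-12-25"), tz = "UTC")

# Data frame: the value goes through as.character(), so dates are formatted
rownames(mydf) <- mydates
# Matrix: the primitive coerces the underlying numbers to character
rownames(mymat) <- mydates

df_names <- rownames(mydf)
mat_names <- rownames(mymat)
print(df_names)
print(mat_names)
```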

However, there was a problematic assumption in the row.names<-.data.frame 
and dimnames<-.data.frame methods, which tested the length of 'value' 
before coercion.  That sounds reasonable, but in unusual cases such as 
POSIXlt, coercion changes the length.  I have therefore swapped the lines 
around so that the length is tested after coercion.
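
To see why the order matters, here is a rough sketch (not the code in the R 
sources).  A POSIXlt object is stored as a list of components, so the length 
of the underlying list differs from the number of strings produced by 
coercion.  unclass() is used here on the assumption that length() itself may 
report either figure depending on the version of R; tz = "UTC" and the 
variable names are also my own choices.

```r
# tz = "UTC" is an assumption to keep the example independent of locale
lt <- as.POSIXlt(c("2001-01-24", "2005-12-25"), tz = "UTC")

# Length of the underlying list of components (sec, min, hour, ...)
n_components <- length(unclass(lt))

# Length after coercion to character: one string per date-time
n_strings <- length(as.character(lt))

print(c(components = n_components, strings = n_strings))

# With the length tested after coercion, a data frame with two rows
# should accept the two-element POSIXlt value as row names
mydf <- data.frame(matrix(1:4, nrow = 2, ncol = 2))
row.names(mydf) <- lt
print(row.names(mydf))
```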

What you expected was that dimnames<-() would coerce to character, 
although I can find no support for that expectation in the documentation. 
If it were not a primitive function that would be easy to achieve, but as 
it is, it would need an expert in the internal code to change.  There is 
also the risk of inconsistency, since as the comment says, the C code is 
used in places where the context is not known.  I think this is probably 
best left alone.
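
In the meantime, a user who wants formatted dates as the dimnames of a 
matrix can coerce explicitly before assigning.  A minimal sketch, reusing 
the names from the original report (tz = "UTC" and the "%Y" format for the 
column names are assumptions made for the example):

```r
mymat <- matrix(1:4, nrow = 2, ncol = 2)
# tz = "UTC" is an assumption, used to avoid dependence on the local time zone
mydates1 <- as.POSIXlt(c("2001-01-24", "2005-12-25"), tz = "UTC")

# Coerce explicitly, so the primitive only ever sees character vectors
rownames(mymat) <- format(mydates1)
colnames(mymat) <- format(mydates1, "%Y")
print(mymat)
```

Since the value is already character when it reaches dimnames<-, the 
result does not depend on how the primitive treats classed objects.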


On Wed, 29 Mar 2006, Prof Brian Ripley wrote:

> Yet again, this is the wrong list for suggesting changes to R.  Please do use 
> R-devel for that purpose (and I have moved this).
>
> If this bothers you (it all works as documented, so why not use it as 
> documented?), please supply a suitable patch to the current R-devel sources 
> and it will be considered.
>
> And BTW, row.names is the canonical accessor function for data frames,
> and its 'value' argument is documented differently from that for rownames for 
> an array.  Cf:
>
> Details:
>
>     The extractor functions try to do something sensible for any
>     matrix-like object 'x'.  If the object has 'dimnames' the first
>     component is used as the row names, and the second component (if
>     any) is used for the col names.  For a data frame, 'rownames' and
>     'colnames' are equivalent to 'row.names' and 'names' respectively.
>
> Note:
>
>     'row.names' is similar to 'rownames' for arrays, and it has a
>     method that calls 'rownames' for an array argument.
>
> I am not sure why R decided to add rownames for the same purpose as 
> row.names: eventually they were made equivalent.
>
>
> On Tue, 21 Mar 2006, Erich Neuwirth wrote:
>
>> I noticed something surprising (in R 2.2.1 on WinXP).
>> According to the documentation, rownames and colnames are character 
>> vectors.
>> Assigning a vector of class POSIXct or POSIXlt as rownames or colnames
>> therefore is not strictly according to the rules.
>> In some cases, R performs a reasonable typecast, but in some other cases
>> where the same typecast also would be possible, it does not.
>> 
>> Assigning a vector of class POSIXct to the rownames or names of a
>> dataframe creates a reasonable string representation of the dates (and
>> possibly times).
>> Assigning such a vector to the rownames or colnames of a matrix produces
>> rownames or colnames consisting of the integer representation of the
>> date-time value.
>> Trying to assign a vector of class POSIXlt in all cases
>> (dataframes and matrices, rownames, colnames, names)
>> produces an error.
>> 
>> Demonstration code is given below.
>> 
>> This is somewhat inconsistent.
>> Perhaps a reasonable solution could be that the typecast
>> used for POSIXct and dataframes is used in all the other cases also.
>> 
>> Code:
>> 
>> mymat<-matrix(1:4,nrow=2,ncol=2)
>> mydf<-data.frame(mymat)
>> mydates<-as.POSIXct(c("2001-1-24","2005-12-25"))
>> 
>> rownames(mydf)<-mydates
>> names(mydf)<-mydates
>> rownames(mymat)<-mydates
>> colnames(mymat)<-mydates
>> 
>> print(deparse(mydates))
>> print(deparse(rownames(mydf)))
>> print(deparse(names(mydf)))
>> print(deparse(rownames(mymat)))
>> print(deparse(colnames(mymat)))
>> 
>> mydates1<-as.POSIXlt(mydates)
>> 
>> # the following lines will not work and
>> # produce errors
>> 
>> rownames(mydf)<-mydates1
>> names(mydf)<-mydates1
>> rownames(mymat)<-mydates1
>> colnames(mymat)<-mydates1
>> 
>> 
>> 
>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


