[Rd] (PR#8192) [ subscripting sometimes loses names

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Sat Jan 31 22:36:45 CET 2009

Christian Brechbühler wrote:

>>> data.frame(val=1:3,row.names=letters[1:3])[,1]
>> [1] 1 2 3
>> but it's not obvious that the result should be named using the row.names
>> and (in particular) whether or why it should differ from .....[[1]] and
>> ....$val. 

this might be a good argument, were it not that [,1] returning a vector
rather than a one-column data frame is already inconsistent (with
[,1:2], for example).  if [,1] did not drop the data.frame class
and returned a data frame instead, it would be obvious that the result
should use row names.


will keep the class and row names, though ?'[' says "drop: For matrices
and arrays.".
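a minimal sketch of the behaviour discussed above, assuming the expression in question passes drop = FALSE (the variable names df, v and d are only illustrative):

```r
# Same data frame as in the quoted example.
df <- data.frame(val = 1:3, row.names = letters[1:3])

# Default extraction of a single column drops to a bare vector;
# the row names are not carried over as names.
v <- df[, 1]
is.null(names(v))

# With drop = FALSE the result is still a data frame,
# and the row names survive.
d <- df[, 1, drop = FALSE]
class(d)
rownames(d)
```

so the drop argument does affect data frames in practice, even though the help text quoted above mentions only matrices and arrays.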

it doesn't mean that dropping row names (or dropping dimensions) isn't
useful and handy in specific cases, but this makes it no less
inconsistent.

>> Given that for most purposes, extracting the relevant names would
>> just be unnecessary red tape, I'd say that we can do without it.
> Compare
>> data.frame(val=1:3,row.names=letters[1:3])[,1]
> [1] 1 2 3
>> as.matrix(data.frame(val=1:3,row.names=letters[1:3]))[,1]
> a b c
> 1 2 3
> X[,1] preserves row names if X is a matrix, and loses them if X is a data
> frame.  To me, this is ugly and inconsistent.
> One might argue that having names and dimnames at all is "red tape", and
> wastes memory and computational efficiency -- after all, Fortran arrays had
> no names.  But R chose to drag along the names (sometimes), and it can be
> very helpful to us humans.  Now R should do it consistently.

i support this opinion.  whether to have or not to have row names is a
design decision, and both options may be reasonably argued for and
against.  but lack of consistency is seldom any good;  r consistently
lacks consistency.
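
for reference, the inconsistency and one possible workaround can be sketched as follows.  this is only an illustration, not a proposed fix; the names from_df, from_mat and named are hypothetical:

```r
df <- data.frame(val = 1:3, row.names = letters[1:3])

# Data frame: single-column extraction loses the row names.
from_df <- df[, 1]
names(from_df)

# Matrix: the same subscript keeps them as names.
from_mat <- as.matrix(df)[, 1]
names(from_mat)

# One way to get a named vector from the data frame:
# reattach the row names explicitly with setNames().
named <- setNames(df[, 1], rownames(df))
names(named)
```

the explicit setNames() call is exactly the kind of "red tape" mentioned in the quoted text; whether it should be needed is the design question under discussion.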

