[Rd] nrow(rbind(character(), character())) returns 2 (as documented but very unintuitive, IMHO)

Martin Maechler <maechler using stat.math.ethz.ch>
Fri May 17 11:39:31 CEST 2019


>>>>> Gabriel Becker 
>>>>>     on Fri, 17 May 2019 01:06:11 -0700 writes:

    > Hi Martin,
    > Thanks for chiming in. Responses inline.

    > On Fri, May 17, 2019 at 12:32 AM Martin Maechler <maechler using stat.math.ethz.ch>
    > wrote:

    >> >>>>> Gabriel Becker
    >> >>>>>     on Thu, 16 May 2019 15:47:57 -0700 writes:
    >> 
    >> > Hi Hadley,
    >> > Thanks for the counterpoint. Response below.
    >> 
    >> > On Thu, May 16, 2019 at 1:59 PM Hadley Wickham <h.wickham using gmail.com>
    >> wrote:
    >> 
    >> >> The existing behaviour seems intuitive to me. I would consider these
    >> >> invariants for n vector x_i's each with size m:
    >> >>
    >> >> * nrow(rbind(x_1, x_2, ..., x_n)) equals n
    >> >>
    >> 
    >> > Personally, no I wouldn't. I would consider m==0 a degenerate case,
    >> where
    >> > there is no data, but I personally find matrices (or data.frames)
    >> with rows
    >> > but no columns a very strange concept. The converse is not true, I
    >> > understand the utility of columns but no rows, particularly in the
    >> > data.frame case, but rows with no columns are observations we didn't
    >> > observe anything about. Strange, imho.
    >> 
    >> Gabe, here I have to very strongly disagree.
    >> 
    >> Matrices (and higher-order arrays) are definitely always meant to
    >> behave "symmetrically" / "uniformly" with respect to all of their
    >> dimensions.
    >> 
    >> We (and the S developers before us) have always taken a lot of
    >> care trying to ensure that this is true.
    >> 
    >> So for the matrix case, if rows and columns behaved differently
    >> that would be a bug "by definition".
    >> 

    > I realize now I could have been clearer/more explicit about this, but I
    > wasn't arguing that the behavior should be different between columns and
    > rows, just that the behavior in the rows case didn't necessarily make a ton
    > of sense to me. I was arguing that a change to both rbind and cbind be
    > considered when all length zero vectors are passed, not that rbind change
    > without cbind also changing. I will admit even here to feeling much more
    > strongly about the data.frame case.

    > That said, I do see that the cbind/columns argument seems harder (though
    > not impossible) for me to make. And maybe that's a good enough reason not
    > to consider such a change, because as I say, I agree the symmetry is
    > important, and would (also) want cbind to change the same way rbind did if
    > such a change happened, and that might bother many (?) more people than the
    > rbind case would. Maybe not though, based on the other responses in the
    > thread.

    > Honestly, the most intuitive thing for me if you rbind or cbind a bunch of
    > length zero vectors together would be a 0x0 matrix, at the very least in
    > the non-named arguments case. It's a matrix with 0 elements in it, after
    > all. It seems perhaps that my intuition is just somewhat non-standard
    > though.

I think your "problem" may be that you have not yet appreciated
the importance of {0 x p} and {n x 0} matrices, and would
think all of these should be {0 x 0}?
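
For concreteness, here is a minimal base-R sketch of the case in the subject line. The variable names are only for illustration, and nothing beyond rbind(), cbind() and dim() is assumed:

```r
# Two zero-length character vectors, as in the subject line
x <- character()
y <- character()

r <- rbind(x, y)   # one row per argument, zero columns
k <- cbind(x, y)   # zero rows, one column per argument

dim(r)             # 2 0
dim(k)             # 0 2

# Rows and columns are treated the same way: transposing the
# rbind() result gives the shape of the cbind() result.
identical(dim(t(r)), dim(k))
```

A {0 x 0} result would lose the information that two vectors were bound, whereas {2 x 0} and {0 x 2} keep it.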

Believe me, we did quite a bit of reasoning at the time, looking at
the associative law, transitivity, etc., which I can't easily
recall, but it has been very beneficial to
consistently deal with n x 0 and 0 x d matrices:
Much R code could be simplified, or automagically worked
correctly in edge cases, once such matrices fulfilled
basic consistency identities.
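
As an illustration only (this is not the reasoning from back then, just a small sketch of the kind of edge case meant), consider a helper that splits the columns of a matrix by a logical mask. The function split_cols() is hypothetical and made up for this example:

```r
X <- matrix(1:6, nrow = 3, ncol = 2)

# Hypothetical helper: split the columns of X by a logical mask.
split_cols <- function(X, keep) {
  list(kept = X[, keep,  drop = FALSE],
       rest = X[, !keep, drop = FALSE])
}

# Edge case: keep no columns at all.
s <- split_cols(X, keep = c(FALSE, FALSE))
dim(s$kept)        # 3 0  (still three rows)

# Binding the two parts back together recovers the original shape,
# with no special-casing of the empty part.
back <- cbind(s$kept, s$rest)
dim(back)          # 3 2

# Row-wise summaries of the empty part still have one entry per row.
rowSums(s$kept)    # 0 0 0
```

If the empty part collapsed to {0 x 0}, the cbind() step would need an explicit check for the "nothing kept" case.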

Martin


