[Rd] nrow(rbind(character(), character())) returns 2 (as documented but very unintuitive, IMHO)

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Fri May 17 09:32:56 CEST 2019


>>>>> Gabriel Becker 
>>>>>     on Thu, 16 May 2019 15:47:57 -0700 writes:

    > Hi Hadley,
    > Thanks for the counterpoint. Response below.

    > On Thu, May 16, 2019 at 1:59 PM Hadley Wickham <h.wickham using gmail.com> wrote:

    >> The existing behaviour seems intuitive to me. I would consider these
    >> invariants for n vectors x_i, each of size m:
    >> 
    >> * nrow(rbind(x_1, x_2, ..., x_n)) equals n
    >> 

    > Personally, no, I wouldn't. I would consider m==0 a degenerate case, where
    > there is no data, but I personally find matrices (or data.frames) with rows
    > but no columns a very strange concept. The converse is not true: I
    > understand the utility of columns but no rows, particularly in the
    > data.frame case, but rows with no columns are observations we didn't
    > observe anything about. Strange, imho.

Gabe, here I have to very strongly disagree.

Matrices (and higher-order arrays) are definitely always meant to
behave "symmetrically" / "uniformly" with respect to all of their dimensions.

We (and the S developers before us) have always taken a lot of
care trying to ensure that this is true.

So for the matrix case, if rows and columns behaved differently
that would be a bug "by definition".
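
As a small sketch of that symmetry in the zero-extent case from the
subject line (the names r and cc are only illustrative), rbind() and
cbind() of two zero-length vectors should give mirror-image dimensions:

```r
# Two zero-length character vectors, bound by row and by column.
r  <- rbind(character(), character())   # 2 rows, 0 columns
cc <- cbind(character(), character())   # 0 rows, 2 columns

dim(r)    # 2 0
dim(cc)   # 0 2

# Transposing swaps the roles of the two dimensions.
stopifnot(identical(dim(t(r)), dim(cc)))
```

If nrow(r) were 0 here while ncol(cc) stayed 2, rows and columns would
no longer be treated alike.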

Of course there's one place where this uniformity / symmetry
must be violated: in the coercion from and to atomic vectors.
There, 'by column' (generalized for arrays to "earlier dimensions vary
faster than later ones") has been chosen, not least because this had
been adopted for Fortran (first, AFAIK) and all related ABIs
dealing with matrix-vector arithmetic, for very good (numerical,
performance, known convention) reasons that made it possible to know
how fast numerical linear algebra should be implemented.
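
To illustrate the 'by column' convention, here is a minimal sketch (the
objects m and a are made up for the example) showing that the first
index varies fastest when going to and from atomic vectors:

```r
# A 2 x 3 matrix built from 1:6 is filled column by column.
m <- matrix(1:6, nrow = 2)
m[, 1]          # 1 2
as.vector(m)    # 1 2 3 4 5 6

# Same rule for a 2 x 3 x 4 array: earlier dimensions vary faster.
a <- array(1:24, dim = c(2, 3, 4))
a[2, 1, 1]      # 2  (one step along the first dimension)
a[1, 2, 1]      # 3  (one step along the second dimension)
a[1, 1, 2]      # 7  (one step along the third dimension)
```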

Martin


