[Rd] 'data.frame' method for base::rep()

Martin Maechler maechler at stat.math.ethz.ch
Wed Aug 3 14:44:20 CEST 2011


>>>>> David Winsemius <dwinsemius at comcast.net>
>>>>>     on Tue, 2 Aug 2011 10:14:59 -0400 writes:

    > On Aug 2, 2011, at 7:55 AM, Liviu Andronic wrote:

    >> Dear R developers, would you consider adding a
    >> 'data.frame' method for the base::rep function? The need
    >> to replicate a df row-wise can easily arise while
    >> programming, and rep() is unable to handle such a
    >> case. See below.
    >>> x <- iris[1, ]
    >> 
    > x[ rep(1,2), ] # "works"

Yes, indeed, and that, I think, is my "definitive" answer
to the proposal.
Defining a rep() method for data frames seems much less sensible:
first, because a simple "substitute" exists (namely indexing,
see above), and, not least, because there are several problems /
questions that would have to be answered:

- Why should rep() for data frames necessarily replicate rows and
  not columns?
- If some rows should be resampled, why each row exactly the
  same number of times?
- Any solution that is not compatible with    x [ rep(i, k) , ]
  would be unsatisfactory.
- What rownames should the new data frame get in case of "real"
  rownames (i.e., not the fast "1:n" pseudo-rownames)?
  The informal definition of a data frame says that the rownames
  must be unique.
 
  --> and of course, the indexing solution

     xx <- iris[ rep(1:nrow(iris), 3) , ]
 
  does implement one sensible way of producing unique row.names,
  {though, I must say, not the "optimal" one if the issue is efficiency}
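  As a small illustration of that row-name behaviour, here is a minimal
  sketch on a two-row subset of iris (the names x and xx are only
  illustrative). Indexing with repeated row numbers makes the duplicated
  row names unique by appending a numeric suffix:

```r
# Replicate every row of a small data frame three times via indexing.
x  <- iris[1:2, ]
xx <- x[rep(seq_len(nrow(x)), 3), ]

# Row names of the result; duplicates have been made unique.
rn <- rownames(xx)
print(rn)

# anyDuplicated() returns 0 when all row names are unique.
print(anyDuplicated(rn))
```

  The same mechanism applies to "real" row names, e.g. those of mtcars,
  where the suffix is appended to the original character name.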

I'd rather we keep using [ , ] and not get into having to maintain
yet another data.frame method.
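
For what it's worth, the questions raised in the list above (rows or
columns, equal or unequal counts) are all covered by plain indexing. A
rough sketch, assuming a three-row subset of iris and illustrative
variable names:

```r
x <- iris[1:3, ]
i <- seq_len(nrow(x))

# Whole data frame repeated twice: rows 1,2,3,1,2,3.
same <- x[rep(i, times = 2), ]

# Each row repeated consecutively: rows 1,1,2,2,3,3.
each2 <- x[rep(i, each = 2), ]

# Unequal counts per row: row 1 once, row 2 twice, row 3 three times.
uneven <- x[rep(i, times = c(1, 2, 3)), ]

# Columns replicated instead of rows.
cols <- x[, rep(seq_len(ncol(x)), 2)]

print(dim(same))
print(dim(each2))
print(dim(uneven))
print(dim(cols))
```

A single rep() method would have to pick one of these as "the" meaning,
whereas indexing leaves the choice to the caller.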

Martin Maechler, ETH Zurich


