[R] Randomly remove condition-selected rows from a matrix

Stavros Macrakis macrakis at alum.mit.edu
Wed Dec 31 19:42:36 CET 2008


On Wed, Dec 31, 2008 at 12:44 PM, Guillaume Chapron
<carnivorescience at gmail.com> wrote:
>> m[-sample(which(m[,1]<8 & m[,2]>12),2),]
> Suppose I sample only one row among the ones matching my criteria. Then
> consider the case where just one row matches these criteria. Sure,
> there is no need to sample, but the instruction would still be executed.
> Then if this row index is 15, my instruction becomes sample(15,1), and this
> can give me any row from 1 to 15, which is not correct. I have to add a
> condition for the case where only one row matches the criteria.

Yes, this is a (documented!) design flaw in 'sample' -- see the man page.
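
A minimal sketch of one workaround, assuming a version of R that
provides sample.int. The matrix m and the helper name resample are made
up for illustration (resample is not a base R function, though a helper
along these lines appears in the examples on the help page for sample in
recent versions of R). The idea is to sample positions within the index
vector rather than the index values themselves:

```r
# Hypothetical data: exactly one row (row 15) satisfies the condition.
m <- cbind(c(rep(9, 14), 1), c(rep(10, 14), 20))
idx <- which(m[, 1] < 8 & m[, 2] > 12)   # a single index: 15

# Pitfall: sample(idx, 1) is sample(15, 1) here, i.e. a draw from 1:15.

# Workaround: draw positions 1..length(x), then index x with them.
# 'resample' is an illustrative helper name, not part of base R.
resample <- function(x, ...) x[sample.int(length(x), ...)]

row_to_drop <- resample(idx, 1)          # the only candidate, row 15
m2 <- m[-row_to_drop, , drop = FALSE]    # m without the selected row
```

Because the draw is over positions, a length-one idx behaves like any
other length and no special case is needed.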

For some reason, the designers of R have chosen to document the flaw
and leave it up to individual users to work around it rather than fix
it definitively.  A related case is sample(c(),0), which gives an
error rather than giving an empty vector, though in general R deals
with empty vectors correctly (e.g. sum(c()) => 0).
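
For the row-removal problem in this thread, the empty case needs its own
guard, because a negative index built from an empty vector selects no
rows instead of all of them. A hedged sketch, with a made-up matrix m
and illustrative variable names, assuming a version of R that provides
sample.int:

```r
# Hypothetical case: no row satisfies the condition.
m <- matrix(1:6, nrow = 3)
idx <- which(m[, 1] > 100)               # integer(0)

# Never ask for more rows than actually matched.
n_drop <- min(1, length(idx))

# Sampling positions works for length 0 too and yields integer(0).
pick <- idx[sample.int(length(idx), n_drop)]

# m[-integer(0), ] would return zero rows, so skip the removal
# explicitly when nothing was picked.
m2 <- if (length(pick) > 0) m[-pick, , drop = FALSE] else m
```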

To my mind, it is bizarre to have an important basic function which
works for some argument lengths but not others.  The convenience of
being able to write sample(5,2) for sample(1:5,2) hardly seems worth
inflicting inconsistency on all users -- but perhaps one of the
designers of R/S can enlighten us on the design rationale here.

           -s


