[Rd] sample (PR#1212)

Liaw, Andy andy_liaw@merck.com
Fri, 14 Dec 2001 08:35:56 -0500


> From: maechler@stat.math.ethz.ch [mailto:maechler@stat.math.ethz.ch]
> 
> >>>>> "possolo" == possolo  <possolo@crd.ge.com> writes:
> 
>     possolo> Full_Name: Antonio Possolo Version: 1.3.1 OS: Linux
>     possolo> (RH 7.1), Windows 2000 Submission from: (NULL)
> 
> 
>     possolo> A FEATURE THAT EASILY GENERATES BUGS
> 
>     possolo> sample(pi, size=1) produces 1, 2, or 3.
>     possolo> sample(c(pi, pi), size=1) produces 3.141593 always.
> 
>     possolo> Although this conforms with the behavior explained
>     possolo> in the help page for "sample", the behavior for the
>     possolo> case where x (in sample(x, ...)) has length 1 can
>     possolo> easily lead to errors if x is generated
>     possolo> automatically and one neglects to check its length
>     possolo> before sampling from it.
> 
> I completely share your opinion; and we (not I) only recently
> had a case where a user-written function did not work in some
> cases when     sample(iv, ...)   was used, with an integer 
> vector iv which sometimes (rarely) was of length one.
> In your case with `pi' (which is not an integer), we could make
> sample() give a warning at least instead of silently coercing to
> integer; in our case however, iv[] *is* an integer vector, just
> sometimes of length 1 which is inherently ``non-decidable''.
> 
> The reason sample() works as it does is S - compatibility
> and that *is* important.
> 
>     possolo> I believe it would be safest to require x to be
>     possolo> always the full set of values one wishes to sample
>     possolo> from, and remove the special meaning that is
>     possolo> attached to the case when x is of length 1.
> 
> What we *could* consider instead, without breaking
> back-compatibility, is adding an additional argument
> `isVector', e.g.
> 
>    sample (x, size, replace = FALSE, prob = NULL, isVector = FALSE) 
> 
> such that if you use
> 
>      sample(pi, size = 1, isV = TRUE)
> 
> you would always get 3.14159...
> This would at least make the user-written code much nicer to
> read than (what is currently needed)
> 
>      i.rand <- if(length(iv) == 1) iv  else  sample(iv, .....)
> 
> 
> Other opinions?  {from Insightful as well -- this would be worth
> 		 doing in all implementations of S}

I like Martin's suggestion of adding an argument.  

What I have done previously, when I encountered this problem, was simply to
make a local copy of 'sample', call it 'mysample', and modify it so that it
handles a scalar argument as described above.  This is simple enough that I
wonder whether it is worth changing the core code.  If R Core (and maybe
Insightful) are kind enough to make this change, I can only be happier.
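
For illustration, a minimal sketch of such a wrapper could look like the
following.  Only the name 'mysample' comes from the description above; the
body is an assumption, not the original code.  It uses base R's sample.int()
to draw index positions, so x is always treated as the full set of values to
sample from, whatever its length.

```r
# Hypothetical wrapper: the name 'mysample' is from the message above,
# the implementation is an assumed sketch.
mysample <- function(x, size = length(x), replace = FALSE, prob = NULL) {
  # Sample positions 1..length(x) rather than x itself, so a length-one x
  # is never reinterpreted as the range 1:x.
  idx <- sample.int(length(x), size = size, replace = replace, prob = prob)
  x[idx]
}

mysample(pi, size = 1)      # always 3.141593
mysample(c(2, 5, 9))        # a permutation of 2, 5, 9
```

Indexing by sampled positions keeps the semantics of sample() for vectors of
length two or more and only changes the length-one case.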

Cheers,
Andy
