tabulate causes segmentation fault (PR#156)

ripley@stats.ox.ac.uk
Sun, 4 Apr 1999 09:04:17 +0200


On Sun, 4 Apr 1999, Bill Venables wrote:

> >>>>> "Peter" == Peter Dalgaard BSA <p.dalgaard@biostat.ku.dk> writes:
> 
>     Peter> wvenable@arcola.stats.adelaide.edu.au writes:
> 
>     >> R : Copyright 1999, The R Development Core Team Version
>     >> 0.63.3 (March 6, 1999)
>     >> 
>     >> ....
>     >> 
>     >> [Previously saved workspace restored]
>     >> 
>     >> > tabulate(1:10, 5)
>     >> 
>     >> Process R:1 segmentation fault at Sat Apr 3 17:48:34 1999
> 
>     Peter> It wasn't...
> 
>     Peter> However, all it needs is a bit of defensive
>     Peter> programming.
> 
> Yes, Peter, but by whom?  My point was that this kind of massive
> bear trap was not something that should be the users' sole
> responsibility to avoid.  Having a simple exception capable of
> killing the session and losing all data is perhaps OK for
> Microsoft, but it should not be OK for us.  I believe the flaw is
> still present in the nascent 0.64, which bothers me.
> 
> I think it should be solved at the C level since tabulate() is
> one of those things that you want to be ultra-slick, but if this
> is not possible right now, perhaps we will have to put up with a

I have put a change in at the C level _and_ altered the help page to
document what tabulate does and what its arguments are. I think the 
defensive programming should be at the C level, for users can call the 
entry point.

The code is now much cleaner (I think):

tabulate <- function(bin, nbins = max(1,bin))
{
    if(!is.numeric(bin) && !is.factor(bin))
        stop("tabulate: bin must be numeric or a factor")
    .C("tabulate",
       as.integer(bin),
       as.integer(length(bin)),
       as.integer(nbins),
       ans = integer(nbins))$ans
}

As tabulate() silently ignored negative integers, I thought it could
silently ignore ones beyond nbins too, especially as I have documented
this.
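
For illustration, a C routine with the behaviour described above might
look like the following. This is a sketch under assumptions, not the code
committed to R: the name tabulate_sketch is hypothetical, and the argument
order simply follows the .C() call shown earlier (data, length, number of
bins, result). All arguments are pointers because .C() passes everything
by reference.

```c
/* Hypothetical sketch of a defensive tabulate routine; not the R source. */
void tabulate_sketch(const int *x, const int *n, const int *nbin, int *ans)
{
    int i;

    /* Start from zero counts, whatever the caller passed in. */
    for (i = 0; i < *nbin; i++)
        ans[i] = 0;

    /* Count only values in 1..nbin; anything else is silently skipped,
       so no write can land outside ans[0 .. nbin-1]. */
    for (i = 0; i < *n; i++)
        if (x[i] >= 1 && x[i] <= *nbin)
            ans[x[i] - 1]++;
}
```

With a range check of this kind, a call equivalent to tabulate(1:10, 5)
counts the values 1 to 5 once each and drops 6 to 10 instead of writing
past the end of the result.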

I reckon the answer for

> tabulate(numeric(0))
[1] 0

is wrong (I would have had a zero-length vector) but have left it for
compatibility.
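
The length-one answer follows from the default nbins = max(1,bin): with an
empty bin the maximum is taken over the literal 1 alone, so nbins is 1 and
the result is a single zero count. A minimal C sketch of the same rule
(default_nbins is a hypothetical helper for illustration, not part of R):

```c
/* Hypothetical helper mirroring the R default nbins = max(1, bin). */
int default_nbins(const int *x, int n)
{
    int i, m = 1;   /* the literal 1 in max(1, bin) acts as a floor */

    for (i = 0; i < n; i++)
        if (x[i] > m)
            m = x[i];
    return m;       /* stays 1 when n == 0, i.e. for empty input */
}
```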

> stop-gap measure.  For example you could change it in 0.64 to
> 
> tabulate <- function (bin, nbins = max(bin)) {  
>     if (!is.numeric(bin) && !is.factor(bin)) 
>         stop("tabulate: bin must be numeric or a factor")
> 
>     nbins <- max(0, nbins) 
>     if(!missing(nbins) && !all(OK <- (bin <= nbins))) 
> 	bin <- bin[OK]
>     n <- length(bin)
>     storage.mode(bin) <- "integer"
> 
>     .C("tabulate", bin, n, ans = integer(nbins))$ans
> }

Um. Although not previously documented, tabulate is used with factors (as
in S).

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

