[Rd] segfault in glm.fit (PR#14154)

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Dec 17 15:03:52 CET 2009


I cannot reproduce this on our x86_64 Fedora systems (and I tried all 
the usual tricks such as gctorture and valgrind to provoke a problem).
And I have fitted much larger GLMs many times over the last decade, so 
your 'bug summary' cannot be the whole story.

Your example is random and you haven't set a seed: to rule out that 
there is something specific about the data you tried, can you set one 
and tell us which seed failed?
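
Something along the following lines would do. It is only a sketch: the 
range of seeds is arbitrary (my assumption, not part of the report), and 
the seed is printed before each fit so that the last one shown before a 
crash identifies the data that failed.

```r
## Sketch of a seeded version of the reported example.
## The seeds 1:5 are an arbitrary, hypothetical choice.
N <- 16400
for (seed in 1:5) {
    set.seed(seed)
    df <- data.frame(x = runif(N, min = 1, max = 2), y = rpois(N, 2))
    ## Print first: if the fit segfaults, this is the seed to report.
    cat("trying seed", seed, "\n")
    fit <- glm(y ~ x, family = poisson, data = df)
}
```

If every seed fails, the data are unlikely to be the cause and the build 
becomes the more likely suspect.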

One possibility is a compiler optimization bug, so can you please tell 
us what compilers were used with what flags to build this version of 
R, and if you built it yourself try it without optimization.  (The 
machines I used had GCC 4.3.2 and 4.4.1 with CFLAGS="-g -O3 -Wall 
-pedantic -mtune=core2" FFLAGS="-g -O -mtune=core2": higher levels of 
optimization have known problems with recent x86_64 versions of 
gfortran, and I am wondering if that is an underlying issue.)
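
One way to collect the flags from within R is sketched below. It assumes 
a Unix-alike where 'R CMD config' is available; the helper name 
'build_flags' is purely illustrative. The name of the Fortran compiler 
variable has differed between R versions, so only the flags are queried.

```r
## Sketch: report the C compiler and the C/Fortran flags recorded when
## this copy of R was configured, by calling 'R CMD config' from R.
## Assumes a Unix-alike; 'build_flags' is a hypothetical variable name.
rcmd <- file.path(R.home("bin"), "R")
vars <- c("CC", "CFLAGS", "FFLAGS")
build_flags <- sapply(vars, function(v) {
    out <- system(paste(shQuote(rcmd), "CMD config", v), intern = TRUE)
    paste(out, collapse = " ")
})
print(build_flags)
```

The output of this, together with that of 'gcc --version' and 
'gfortran --version', would answer the question.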


On Thu, 17 Dec 2009, adrian at maths.uwa.edu.au wrote:

> Bug summary:
>             glm() causes a segfault if the argument 'data'
>             is a data frame with more than 16384 rows.
>
> Bug demonstration:
>
> -------input ---------------
>      N <- 16400
>      df <- data.frame(x=runif(N, min=1,max=2),y=rpois(N, 2))
>      glm(y ~ x, family=poisson, data=df)
>
> ------ output ---------------
>      *** caught segfault ***
>      address (nil), cause 'unknown'
>
>      Traceback:
> 1: ifelse(y == 0, 1, y/mu)
> 2: dev.resids(y, mu, weights)
> 3: glm.fit(x = X, y = Y, weights = weights, start = start, etastart =
> etastart,     mustart = mustart, offset = offset, family = family,
> control = control,     intercept = attr(mt, "intercept") > 0)
> 4: glm(y ~ x, family = poisson, data = df)
>
> --------------------------------
>
> The code generates a segfault if the value of 'N' is greater than 16384.
>
> regards
> Adrian Baddeley
>
> ////////////////////////////////////////////////////////////
>
> --please do not edit the information below--
>
> Version:
> platform = x86_64-unknown-linux-gnu
> arch = x86_64
> os = linux-gnu
> system = x86_64, linux-gnu
> status =
> major = 2
> minor = 10.1
> year = 2009
> month = 12
> day = 14
> svn rev = 50720
> language = R
> version.string = R version 2.10.1 (2009-12-14)
>
> Locale:
> LC_CTYPE=en_AU.UTF-8;LC_NUMERIC=C;LC_TIME=en_AU.UTF-8;LC_COLLATE=en_AU.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_AU.UTF-8;LC_PAPER=en_AU.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_AU.UTF-8;LC_IDENTIFICATION=C
>
> Search Path:
> .GlobalEnv, package:stats, package:graphics, package:grDevices,
> package:utils, package:datasets, package:methods, Autoloads, package:base
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


