[Rd] predict.glm returns different results for the same model

Hadley Wickham h.wickham at gmail.com
Fri Apr 27 15:25:13 CEST 2018


Hi all,

Very surprising (to me!) and mystifying result from predict.glm(): the
predictions vary depending on whether or not I use ns() or
splines::ns(). Reprex follows:

library(splines)

set.seed(12345)
dat <- data.frame(claim = rbinom(1000, 1, 0.5))
mns <- c(3.4, 3.6)
sds <- c(0.24, 0.35)
dat$wind <- exp(rnorm(nrow(dat), mean = mns[dat$claim + 1], sd =
sds[dat$claim + 1]))
dat <- dat[order(dat$wind), ]

m1 <- glm(claim ~ ns(wind, df = 5), data = dat, family = binomial)
m2 <- glm(claim ~ splines::ns(wind, df = 5), data = dat, family = binomial)

# The model coefficients are the same
unname(coef(m1))
#> [1]  0.5194712 -0.8687737 -0.6803954  4.0838947  2.3908674  4.1564128
unname(coef(m2))
#> [1]  0.5194712 -0.8687737 -0.6803954  4.0838947  2.3908674  4.1564128

# But the predictions are not!
newdf <- data.frame(wind = seq(min(dat$wind), max(dat$wind), length = 5))
unname(predict(m1, newdata = newdf))
#> [1] 0.51947119 0.03208719 2.82548847 3.90883496 4.06743266
unname(predict(m2, newdata = newdf))
#> [1]  0.5194712 -0.5666554 -0.1731268  2.8134844  3.9295814

Is this a bug?

(Motivating issue from this ggplot2 bug report:
https://github.com/tidyverse/ggplot2/issues/2426)

Thanks!

Hadley



-- 
http://hadley.nz



More information about the R-devel mailing list