[R] predict.lrm vs. predict.glm (with newdata)

Stefan Th. Gries stgries at gmail.com
Thu Dec 30 13:09:14 CET 2010


Hi all

Trying again with this question. (If it's really that stupid a
question, I'd be happy to just receive a link or one or two words to
Google that I haven't thought of before):

I have run into a case where I don't understand why predict.lrm and
predict.glm don't yield the same results.

# My data look like this:
set.seed(1)
library(Design); ilogit <- function(x) { 1/(1+exp(-x)) }
ORDER <- factor(sample(c("mc-sc", "sc-mc"), 403, TRUE))
CONJ <- factor(sample(c("als", "bevor", "nachdem", "weil"), 403, TRUE))
LENGTH_DIFF <- sample(-32:25, 403, TRUE)

# I fit two models:
model.01.lrm <- lrm(ORDER ~ CONJ*LENGTH_DIFF)
model.01.glm <- glm(ORDER ~ CONJ*LENGTH_DIFF, family=binomial)

# Then I wanted to plot for each level of CONJ the predicted
probabilities of ORDER="sc-mc" across a large range of LENGTH_DIFF
values. Thus I use this (yes, I know about type="response" etc.):
# range of LENGTH_DIFF for CONJ="als"
x1 <- factor(rep("als", 201)); x2 <- seq(-100, 100, 1)
y.als.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1,
LENGTH_DIFF=x2)))
y.als.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1,
LENGTH_DIFF=x2)))

# This works perfectly and the two vectors y.als.lrm and y.als.glm are
the same. But then:
# range of LENGTH_DIFF for CONJ="bevor"
x1 <- factor(rep("bevor", 201)); x2 <- seq(-100, 100, 1)
y.bevor.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1,
LENGTH_DIFF=x2)))
y.bevor.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1,
LENGTH_DIFF=x2)))

# Here, y.bevor.lrm is the same as y.als.lrm, but *not* the same as y.bevor.glm
# range of LENGTH_DIFF for CONJ="nachdem"
x1 <- factor(rep("nachdem", 201)); x2 <- seq(-100, 100, 1)
y.nachdem.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1,
LENGTH_DIFF=x2)))
y.nachdem.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1,
LENGTH_DIFF=x2)))

# Here, y.nachdem.lrm is the same as y.als.lrm, but *not* the same as
y.nachdem.glm
# range of LENGTH_DIFF for CONJ="weil"
x1 <- factor(rep("weil", 201)); x2 <- seq(-100, 100, 1)
y.weil.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1,
LENGTH_DIFF=x2)))
y.weil.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1,
LENGTH_DIFF=x2)))

# Here, y.weil.lrm is the same as y.als.lrm, but *not* the same as y.weil.glm
#As a result, in this plot, the left panel is useless, but the right
has the four different curves:
par(mfrow=c(1,2))
plot(x2, y.als.lrm, pch="a", xlim=c(-100, 100), ylim=c(0, 1),
main="with predict.lrm", xlab="Main cl. length - subord. cl. length
(in words)", ylab="Predicted probability of 'sc-mc'");  grid()
  points(x2, y.bevor.lrm, pch="b")
  points(x2, y.nachdem.lrm, pch="n")
  points(x2, y.weil.lrm, pch="w")
plot(x2, y.als.glm, pch="a", xlim=c(-100, 100), ylim=c(0, 1),
main="with predict.glm", xlab="Main cl. length - subord. cl. length
(in words)", ylab="Predicted probability of 'sc-mc'");  grid()
  points(x2, y.bevor.glm, pch="b")
  points(x2, y.nachdem.glm, pch="n")
  points(x2, y.weil.glm, pch="w")
par(mfrow=c(1,1))

What am I doing wrong with predict.lrm?

Thanks in advance for any input you may have,
STG
--
Stefan Th. Gries
-----------------------------------------------
University of California, Santa Barbara
http://www.linguistics.ucsb.edu/faculty/stgries



More information about the R-help mailing list