[R] Calculating odds ratios from logistic GAM model

David Winsemius dwinsemius at comcast.net
Thu Dec 9 15:11:27 CET 2010


On Dec 9, 2010, at 7:14 AM, Denis.Aydin at unibas.ch wrote:

> Dear R-helpers
> I have a question related to logistic GAM models. Consider the following
> example:
> # Load package
> library(mgcv)
>
> # Simulation of dataset
> n <- 1000
> set.seed(0)
> age            <- rnorm(n, 50, 10)
> blood.pressure <- rnorm(n, 120, 15)
> cholesterol    <- rnorm(n, 200, 25)
> sex            <- factor(sample(c('female','male'), n,TRUE))
>
> L <- 0.4*(sex=='male') + 0.045*(age-50) +
>   (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
> y <- ifelse(runif(n) < plogis(L), 1, 0)
>
>
> I now want to fit a logistic GAM model and model age as a cubic spline:
>
> fit <- gam(y ~ blood.pressure + sex + cholesterol + s(age, bs="cr"),
>            family=binomial)

I'm wondering if there might be a problem with my understanding of the  
appropriate terminology. Why would such a model be called logistic?  
There is no parametric relationship between some reference set and the  
rest of the prediction space. And I'm also wondering why one would  
even _want_ an odds ratio? Odds ratios were always an approximation to  
what one really wanted, namely either a proportion or a rate ratio. We  
students were asked to readjust our brains to conform to the  
deliverables from the rather twisted (I suppose "transformed" would be  
more accurate) mechanics of "logistic" regression, and we dutifully  
did so with varying degrees of success. But now ...it seems it should  
be perfectly acceptable to leave that cognitive tunnel behind and use  
the available methods, which can generate perfectly sensible output  
through "predict" methods.

(This does lead to the answer I had originally started to  
write.... just pick a reference value and use predict with  
type="response". And if you understand what odds are, namely  
p/(1-p) for a probability p (many people are incapable of giving a  
mathematically correct definition), then it's pretty straightforward.)
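
A minimal sketch of that suggestion, re-using the simulated data from the
question. The reference age (50), the comparison age (60), the fixed values
for the other covariates, and the names dat, nd, or_resp and or_link are
illustrative assumptions, not part of the original post. Because the model
is additive on the logit scale, the same ratio can be obtained from
type="link", which is computed here as a cross-check.

```r
library(mgcv)

# Simulated data, as in the original question
n <- 1000
set.seed(0)
age            <- rnorm(n, 50, 10)
blood.pressure <- rnorm(n, 120, 15)
cholesterol    <- rnorm(n, 200, 25)
sex            <- factor(sample(c('female','male'), n, TRUE))
L <- 0.4*(sex=='male') + 0.045*(age-50) +
  (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
y <- ifelse(runif(n) < plogis(L), 1, 0)
dat <- data.frame(y, age, blood.pressure, cholesterol, sex)

fit <- gam(y ~ blood.pressure + sex + cholesterol + s(age, bs="cr"),
           family=binomial, data=dat)

# Assumed scenario: two ages, all other covariates held at fixed values
nd <- data.frame(age = c(50, 60),            # reference first, comparison second
                 blood.pressure = 120,
                 cholesterol = 200,
                 sex = factor("female", levels = levels(dat$sex)))

# Predicted probabilities, then odds = p / (1 - p)
p    <- as.numeric(predict(fit, newdata = nd, type = "response"))
odds <- p / (1 - p)
or_resp <- odds[2] / odds[1]                 # odds ratio, age 60 vs age 50

# Cross-check on the linear-predictor (logit) scale
eta <- as.numeric(predict(fit, newdata = nd, type = "link"))
or_link <- exp(eta[2] - eta[1])

print(p)                                     # the probabilities themselves
print(c(or_resp = or_resp, or_link = or_link))
```

The two numbers agree because the difference in logits is the log of the
odds ratio. The predicted probabilities in p are arguably the more directly
interpretable output.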


>
> Now my question: In a normal logistic regression, the odds ratio (OR)
> simply is the exponentiated coefficient exp(beta).
> How is it possible to calculate the odds ratio for age (in this example)
> based on the spline? For example the odds ratio based on the spline
> between the age of, say, 20-30?
> Or even better: How can I plot the odds ratios against age in a
> continuous form?

And my counter-question .... why would we want to? Why are you  
ignoring the predict(model, type="response") facilities?
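
For readers who do want the curve asked about, here is a hedged sketch of
one way to get it from the predict machinery. It assumes a reference age of
50 and an age grid of 30 to 70 (both arbitrary choices, as are the names
ages, Xd, log_or and se), and it uses predict(..., type="lpmatrix") so that
the standard error of the difference from the reference can be computed
from vcov(fit). The intervals are pointwise and conditional on the
estimated smoothing parameter.

```r
library(mgcv)

# Simulated data, as in the original question
n <- 1000
set.seed(0)
age            <- rnorm(n, 50, 10)
blood.pressure <- rnorm(n, 120, 15)
cholesterol    <- rnorm(n, 200, 25)
sex            <- factor(sample(c('female','male'), n, TRUE))
L <- 0.4*(sex=='male') + 0.045*(age-50) +
  (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
y <- ifelse(runif(n) < plogis(L), 1, 0)
dat <- data.frame(y, age, blood.pressure, cholesterol, sex)

fit <- gam(y ~ blood.pressure + sex + cholesterol + s(age, bs="cr"),
           family=binomial, data=dat)

# Assumed reference age and age grid (illustrative choices)
ref_age <- 50
ages    <- seq(30, 70, by = 1)
make_nd <- function(a) {
  data.frame(age = a, blood.pressure = 120, cholesterol = 200,
             sex = factor("female", levels = levels(dat$sex)))
}

# Rows of the linear-predictor matrix for the grid and for the reference
Xp   <- predict(fit, newdata = make_nd(ages),    type = "lpmatrix")
Xref <- predict(fit, newdata = make_nd(ref_age), type = "lpmatrix")

# Difference from the reference row; the parametric columns cancel
Xd     <- sweep(Xp, 2, as.numeric(Xref))
log_or <- as.numeric(Xd %*% coef(fit))
se     <- sqrt(pmax(0, rowSums((Xd %*% vcov(fit)) * Xd)))

or    <- exp(log_or)
lower <- exp(log_or - 1.96 * se)
upper <- exp(log_or + 1.96 * se)

# Odds ratio relative to the reference age, on a log y-axis
plot(ages, or, type = "l", log = "y", ylim = range(lower, upper),
     xlab = "age", ylab = paste("odds ratio vs. age", ref_age))
lines(ages, lower, lty = 2)
lines(ages, upper, lty = 2)
abline(h = 1, lty = 3)
```

The curve passes through 1 at the reference age by construction, and moving
the reference only rescales it. A plot of the predicted probabilities
from type="response" over the same grid carries the same information
without requiring a choice of reference.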

>
> Many thanks for your help.
>
> Best,
> Denis Aydin
>
-- 

David Winsemius, MD
West Hartford, CT


