[R] 'mgcv' package, problem with predicting binomial (logit) data

Jef Vlegels Jef.Vlegels at UGent.be
Mon Aug 30 17:44:50 CEST 2010


There was a problem with the data in attachment, here is a better version.

Use something like: 
data <- read.table("pas_r.txt",header=TRUE,sep=";") 
to read it.

Thanks,
Jef

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: maandag 30 augustus 2010 17:22
To: Jef Vlegels
Subject: FYI offlist Re: [R] 'mgcv' package, problem with predicting
binomial (logit) data


On Aug 30, 2010, at 11:17 AM, Jef Vlegels wrote:

> Dear R-help list,
>
> I'm using the mgcv package to plot predictions based on the gam  
> function.
>
> I predict the chance of being a (frequent) participant at theater  
> plays vs.
> not being a participant by age.
> Because my outcome variable is dichotomous, I use the binomial  
> family with
> logit link function.
>
> Dataset in attachment, code to read it in R:

Nope. .sav files are not accepted by the mail-server. See the Posting  
Guide. (.txt and .pdf are the safest formats)


> data <- read.spss("pas_r.sav")
> attach(data)
>
> In a first step I use 'gam' to model my data and 'predict' to  
> calculate and
> plot the predicted values, this all works fine.
> My code looks like this:
>
> test <- gam(participant ~ s(age,fx=FALSE,bs='cr'),  
> family=binomial(logit))
> summary(test)
> plot(test, shade=TRUE)
> gam.check(test)
> test.pred <- predict(test,newdata=data,se.fit=TRUE,type='response',
> na.action=na.omit)
> I1<-order(age)
> plot(age[I1], test.pred$fit[I1],lty=1, type="l")
> lines(age[I1],test.pred$fit[I1]+2*M1pred$se[I1],lty=2)
> lines(age[I1],test.pred$fit[I1]-2*M1pred$se[I1],lty=2)
> plot(age[I1], test.pred$fit[I1] , type="l")
>
> I a second step, I want to calculate a similar model, but only for
> respondents with a certain characteristic. For example, in this  
> case, only
> for male respondents.
> I use a code that looks like this:
>
> participant_male <- participant[gender=="male"]
> age_male <- age[gender=="male"]
>
> test2<-gam(participant_male ~ s(age_male, fx=FALSE, bs="cr"),
> family=binomial(logit), na.action=na.omit)
> summary(test2)
> plot(test2, shade=TRUE)
>
> I get a nice smoother function in this plot, like I expected.
> Then, when I want to plot the predicted values, I use a code that  
> looks like
> this:
>
> Test2.pred <- predict(test5,se.fit=TRUE, type="response")
> I1<-order(age_male)
> plot(age_male[I1], test2.pred$fit[I1],lty=1)
>
> This last plot, of the predictions, is not what I expect. It's just  
> a random
> scatterplot, not what I would expect from the smoother plot. Does  
> anybody
> know what I did wrong?
>
> Thanks in advance,
> Jef Vlegels
>
> Jef Vlegels
> Ghent University - Department of Sociology
> Korte Meer 3, B-9000 Gent, Belgium
> Tel:  09 264 8343
> www.psw.UGent.be/sociologie
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pas_r.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100830/fcd2b6e1/attachment.txt>


More information about the R-help mailing list