[R] How to extract partial predictions, package mgcv

David Winsemius dwinsemius at comcast.net
Wed Sep 16 19:46:30 CEST 2009


Your code minus the extraneous "lines"  was:

#plot the data
plot(pp_s.x0.~x0,type="p",pch=2,cex=0.50)
sequence=order(x0)
lines(x0[sequence],pp_s.x0.[sequence])
rug(jitter(x0))

My edition was:

plot(pp_s.x0.~x0,type="p",pch=2,cex=0.50, data=f)
sequence=order(f$x0)
lines(f$x0[sequence],f$pp_s.x0.[sequence])
rug(jitter(f$x0))

So almost every one of the last set of plotting needed one or more  
additions
of "f$" or "data=f" in order to run without error. Perhaps you  
attached "f"
earlier and didn't include that code? I personally have given up using  
attach()
for just that reason.

-- 
David.


On Sep 16, 2009, at 1:31 PM, Staffan Roos wrote:

>
> Hi David,
>
> Yes, the strange code "lines" was clearly a mistake from my side.  
> Sorry.
> What "dataframe references" did you add in your code?
>
> /Staffan
>
>
>
> David Winsemius wrote:
>>
>> But removing the extraneous second to last line that says just:
>>
>> lines
>>
>> ... would leave the console looking less puzzling.
>>
>> --  
>> David
>> On Sep 16, 2009, at 1:02 PM, David Winsemius wrote:
>>
>>> It is nice to see worked examples, but there were some errors that
>>> relate to not including dataframe references. I've paste in code  
>>> that
>>> runs without error on my machine:
>>>
>>> On Sep 16, 2009, at 12:39 PM, Staffan Roos wrote:
>>>
>>>>
>>>> Thanks Simon,
>>>>
>>>> I wasn't aware that "predict.gam" could be used in this  
>>>> situation. I
>>>> followed you advice and managed to extract the data I needed.
>>>>
>>>> For people interested, I add the code with comments I used here:
>>>>
>>>> #############################################
>>>> # Full code for extracting partial predictions
>>>> # Start the package mgcv
>>>> library(mgcv)
>>>>
>>>> # Simulate some data (this is from Simon Wood's example in the
>>>> # package "mgcv" manual)
>>>> n<-200
>>>> sig <- 2
>>>> dat <- gamSim(1,n=n,scale=sig)
>>>>
>>>> # Check the data
>>>> dat
>>>>
>>>> ## It looks alright :-)
>>>>
>>>> # Run the GAM
>>>> b<-gam(y~s(x0)+s(I(x1^2))+s(x2)+offset(x3),data=dat)
>>>>
>>>> # Plot the partial predictions (You may need to rescale your window
>>>> to
>>>> # see the non-linearity
>>>> par(mfrow=c(1,3))
>>>> plot(b, scale=0)
>>>>
>>>> ### Clearly, the variables x0 and x2 are non-linear!
>>>
>>>> # I would like to extract the "solid line" in the graphs and the
>>> # associated standard errors from the plots. Thus, I use "predict"
>>> # and change to a data.frame:
>>> c<-data.frame(predict(b,type="terms",se.fit=TRUE)$fit)
>>> names(c)<-c("pp_s.x0.","pp_s.I.x1.2..","pp_s.x2.")
>>>
>>> d<-data.frame(predict(b,type="terms",se.fit=TRUE)$se.fit)
>>> names(d)<-c("se_s.x0.","se_s.I.x1.2..","se_s.x2.")
>>>
>>> # Merge the three datasets:
>>> f=cbind(dat,c,d)
>>>
>>> #Restrict the number of variables to the ones I am interested in:
>>> f<-f[,c("x0","pp_s.x0.", "se_s.x0.")]
>>> names(f)
>>>
>>> ### This data, i.e. "f", can now be exported or used for further
>>> ### calculations within R
>>>
>>> #plot the data
>>> par(mfrow=c(1,1))
>>> plot(pp_s.x0.~x0,type="p",pch=2,cex=0.50, data=f)
>>> sequence=order(f$x0)
>>> lines(f$x0[sequence],f$pp_s.x0.[sequence])
>>> rug(jitter(f$x0))
>>>
>>>> ##########################################
>>>>
>>>>
>>>> Staffan,
>>>>
>>>> Take a look at ?predict.gam. You need to use type="terms" with
>>>> se=TRUE to
>>>> get
>>>> the quantities that you need.
>>>>
>>>> Simon
>>>>
>>>> ps. with binary data,  method="REML" or method="ML" for the gam fit
>>>> are
>>>> often
>>>> somewhat more reliable than  the default.
>>>>
>>>> On Monday 14 September 2009 19:30, Staffan Roos wrote:
>>>>> Dear package mgcv users,
>>>>>
>>>>>
>>>>>
>>>>> I am using package mgcv to describe presence of a migratory bird
>>>>> species
>>>>> as
>>>>> a function of several variables, including year, day number (i.e.
>>>>> day-of-the-year), duration of survey, latitude and longitude.
>>>>> Thus, the
>>>>> "global model" is:
>>>>>
>>>>> global_model<-gam(present ~ as.factor(year) + s(dayno, k=5) +
>>>>> s(duration,
>>>>> k=5) + s(x, k=5) + s(y, k=5), family = binomial(link = logit),
>>>>> data =
>>>>> presence, gamma =1.4)
>>>>>
>>>>> The model works fine, suggesting that all the variables have  
>>>>> strong,
>>>>> non-linear, influences on the probability of detecting the
>>>>> species. My
>>>>> main
>>>>> interest is in the effect "dayno" has on presence, given the
>>>>> inclusion of
>>>>> the other explanatory variables. Thus, I would like to extract the
>>>>> values
>>>>> of the partial prediction of "dayno" and its associated 2 standard
>>>>> errors
>>>>> above and below the main effect, as shown by the code
>>>>> "plot(global_model)".
>>>>>
>>>>> I have tried to extract the values by using
>>>>> "fitted.values(global_model,dayno)", but when plotted, the figure
>>>>> doesn't
>>>>> look like the partial prediction for "dayno". Instead, it gives me
>>>>> a very
>>>>> scattered figure ("shotgun effect").
>>>>>
>>>>> Has anyone tried to extract the partial predictions? If so, please
>>>>> could
>>>>> you advise me how to do this?
>>>>>
>>>
>>>>> Staffan
>>>>>
>>>>> P.S.. For comparison, please have a look at Simon Woods paper in R
>>>>> News,
>>>>> 1(2):20-25, June 2001, especially the figures showing the partial
>>>>> predictions of Mackerel egg densities. Using those graphs as an
>>>>> example, I
>>>>> would like to extract the partial predictions for e.g.
>>>>> "s(b.depth)", given
>>>>> the inclusion of the other variables.
>>>>>
>>>
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
> -- 
> View this message in context: http://www.nabble.com/How-to-extract-partial-predictions%2C-package-mgcv-tp25441149p25477056.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT




More information about the R-help mailing list