[R] mgcv pack- gam() - question about the by argument

Gustaf Granath Gustaf.Granath at ebc.uu.se
Wed Oct 22 14:05:28 CEST 2008


R fellows,
I hope my questions are not too much about statistical methodology.

I do lab experiments where y is measured at different x, using 
(plant)material originating from different sites (2-4). The response is 
non-linear in a way that I have turned to gam() (mgcv package) to 
investigate the shape of the response (maybe there are better ways). I 
want to test if the shape/pattern of the response differ between sites 
(i.e. should each site have its own curve). I was thinking (after I read 
Wood's reference manual) to use the by= argument and fit one model 
including the factor site and one without site, and then compare these 
models with AIC (or can the anova.gam() be trusted here?). Is this a 
correct interpretation and use of the by= argument??

I have also measured the same sample twice (weeks between) . What about 
including the factor Time using the by= argument to test if the curve 
differ between the two measurements (in a separate analysis looking each 
site individually)? I guess this will ignore the correlation within 
samples but maybe it can be used to get a rough idea?

Here is an example ( I know that sample size is low for a gam)
#Create data
y<-c(0.283,0.185,0.134,0.727,0.087,0.568,0.569,0.156,0.189,0.738,0.526,0.144,0.532,0.128,0.162,0.772,0.789,0.798,0.756,0.738,0.768,0.778,0.791,0.776,0.808,
0.756,0.742,0.744,0.758,0.774,0.768,0.780,0.782,0.789,0.756,0.805,0.800,0.804,0.790,0.781,0.757,0.780,0.744,0.783,0.774)
x<-c(45.1,43,48.5,40,47.1,42.8,40.3,45.3,45.6,37.5,38,46.1,40.5,40.2,46.5,27.3,24.9,27.7,24.8,26.6,25.1,23.5,26.3,26.6,27.6,
26.3,24.4,25.8,26.7,28,36,33.3,35.2,35.1,33.9,34.7,34.4,33.4,34.1,34.3,31,33.2,33.4,35.7,32.6)
Site<-rep(1:3,each=5,3)
mydata<-data.frame(y=y,x=x,Site=as.factor(Site))
library(mgcv)
#Model without site
mod.no.site<-gam(y~s(x),data=mydata)
#Model including the factor Site. id=1 to get the same smothing 
parameter at each factor level.
mod.with.site<-gam(y~Site+s(x,by=Site,id=1),data=mydata)
#AIC for the two models
AIC(mod.no.site,mod.with.site)

Thanks,

Gustaf Granath (phd student)



More information about the R-help mailing list