[R] something wrong when using pspline in clogit?

Wed Jan 22 13:02:03 CET 2003

I'm afraid I missed the start of this due to the rather general subject
line, but I think that you wanted an interaction of a smooth and a factor
variable, is that right? Also, I'm not sure whether my suggestion will be
any use given your survival analysis context, but anyway... 

If I understand your code (below) correctly then you are trying to produce
an interaction of a smooth with a factor by setting the argument of the
smooth to zero if the corresponding factor dummy variables are zero. I
don't think this is quite right since s(0) is not generally zero. If you
want to do this properly, then you can use the `by' variable mechanism
provided in package mgcv. The final example in the ?gam.models help file
produces an interaction of a smooth with a factor correctly. 

I think that package gss also allows interactions of smooths and factors,
but I don't actually know how I'm afraid.

Simon
_____________________________________________________________________
> Simon Wood simon at stats.gla.ac.uk  www.ruwpa.st-and.ac.uk/simon.html
>>  Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
>>>   Direct telephone: (0)141 330 4530          Fax: (0)141 330 4814

On Wed, 22 Jan 2003, Vumani Dlamini wrote:

> 
> 
> Dear R users:
> 
> I am not entirely convinced that clogit gives me the correct result when I 
> use pspline() and maybe you could help correct me here.
> 
> When I add a constant to my covariate I expect only the intercept to change, 
> but not the coefficients. This is true (in clogit) when I assume a linear in 
> the logit model, but the same does not happen when I use pspline().
> 
> If I did something similar to what "na.gam.replace" in S+ does, the pspline 
> coefficients are the same, leading me to believe that "na.gam.replace" is 
> doing the right thing (I may be wrong).
> The strange thing here is that even the intercepts are the same. Lastly, 
> these coefficients are the same as using clogit before adding the constant, 
> but the intercept are different.
> 
> I have attached my code for all this, maybe I messed it somewhere.
> 
> 
> Vumani
> 
> ##########################
> xvar<-rnorm(100,0,1)
> data.mult<-rbinom(100,size=3,prob=(exp(0.5+0.4*xvar)/(1+exp(0.5+0.4*xvar))))
> library(Nnet)
> mult.fit<-multinom(data.mult~xvar)
> coef(mult.fit)
> library(survival)
> choice<-c(ifelse(data.mult==0,1,0),ifelse(data.mult==1,1,0),ifelse(data.mult==2,1,0),ifelse(data.mult==3,1,0))
> temp<-list(time=2-choice,
>             status=choice,
>             
> intercept1=c(rep(0,length(data.mult)),rep(1,length(data.mult)),rep(0,length(data.mult)),rep(0,length(data.mult))),
>             
> intercept2=c(rep(0,length(data.mult)),rep(0,length(data.mult)),rep(1,length(data.mult)),rep(0,length(data.mult))),
>             
> intercept3=c(rep(0,length(data.mult)),rep(0,length(data.mult)),rep(0,length(data.mult)),rep(1,length(data.mult))),
>             
> xvar1=c(rep(0,length(data.mult)),rep(1,length(data.mult)),rep(0,length(data.mult)),rep(0,length(data.mult)))*rep(xvar,4),
>             
> xvar2=c(rep(0,length(data.mult)),rep(0,length(data.mult)),rep(1,length(data.mult)),rep(0,length(data.mult)))*rep(xvar,4),
>             
> xvar3=c(rep(0,length(data.mult)),rep(0,length(data.mult)),rep(0,length(data.mult)),rep(1,length(data.mult)))*rep(xvar,4),
>             stratum=rep(1:length(data.mult),4))
> # Used to check whether my data matrix is setup correctly
> # This fits a category one baseline category logit model
> # Compare with multinom
> totalmodel<-clogit(status~intercept1+intercept2+intercept3+xvar1+xvar2+xvar3+strata(stratum),method="exact",temp)
> coef(totalmodel)
> # Use smooth term, "pspline"
> totalmodel<-clogit(status~intercept1+intercept2+intercept3+pspline(xvar1)+pspline(xvar2)+pspline(xvar3)+strata(stratum),method="exact",temp)
> coef(totalmodel)
> # Collect indices where variable is not equal to zero
> indices<-list(var1=NULL,var2=NULL,var3=NULL)
> indices$var1<-temp$xvar1==0
> indices$var2<-temp$xvar2==0
> indices$var3<-temp$xvar3==0
> # Add 10 to covariate
> temp$xvar1[!indices$var1]<-xvar+10
> temp$xvar2[!indices$var2]<-xvar+10
> temp$xvar3[!indices$var3]<-xvar+10
> # First fit linear-in-the-logit model, to check what happens to coefficients
> totalmodel<-clogit(status~intercept1+intercept2+intercept3+xvar1+xvar2+xvar3+strata(stratum),method="exact",temp)
> coef(totalmodel)
> # Fit a smooth model, and compare with smooth model above
> totalmodel<-clogit(status~intercept1+intercept2+intercept3+pspline(xvar1)+pspline(xvar2)+pspline(xvar3)+strata(stratum),method="exact",temp)
> coef(totalmodel)
> # Doing something similar to what na.gam.replace() in S+ does
> temp$xvar1[indices$var1]<-mean(xvar+10)
> temp$xvar2[indices$var2]<-mean(xvar+10)
> temp$xvar3[indices$var3]<-mean(xvar+10)
> # Fit spline model
> totalmodel<-clogit(status~intercept1+intercept2+intercept3+pspline(xvar1)+pspline(xvar2)+pspline(xvar3)+strata(stratum),method="exact",temp)
> coef(totalmodel)
> # Remove 10 and substitute with mean
> # Covariate are like in the beginning
> temp$xvar1[!indices$var1]<-xvar
> temp$xvar2[!indices$var2]<-xvar
> temp$xvar3[!indices$var3]<-xvar
> # Doing something like na.gam.replace()
> temp$xvar1[indices$var1]<-mean(xvar)
> temp$xvar2[indices$var2]<-mean(xvar)
> temp$xvar3[indices$var3]<-mean(xvar)
> totalmodel<-clogit(status~intercept1+intercept2+intercept3+pspline(xvar1)+pspline(xvar2)+pspline(xvar3)+strata(stratum),method="exact",temp)
> coef(totalmodel)
> 
> 
> 
> 
> 
> 
> 
> >From: ripley at stats.ox.ac.uk
> >To: Vumani Dlamini <dvumani at hotmail.com>
> >CC: R-help at stat.math.ethz.ch
> >Subject: Re: [R] R analogue
> >Date: Mon, 20 Jan 2003 13:17:02 +0000 (GMT)
> >
> >On Mon, 20 Jan 2003, Vumani Dlamini wrote:
> >
> > > Is there any R analogue for the S+ function "na.gam.replace".
> >
> >No, for it is tailored for use by S's gam.
> >
> >Some of the things it does are positively undesirable!  It uses mean
> >imputation for continuous variables, but for factors it makes NA into
> >another level, which silently assumes that all missing values are similar
> >and that they are going to occur with sufficient frequency in the
> >training data (and they may not occur at all).
> >
> > > I would like
> > > to make an interaction of a categorical and smooth continuous covariate.
> >
> >You can do that without na.gam.replace: there is an example in the MASS
> >scripts for the low-birth-weight data.
> >
> >
> >--
> >Brian D. Ripley,                  ripley at stats.ox.ac.uk
> >Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> >University of Oxford,             Tel:  +44 1865 272861 (self)
> >1 South Parks Road,                     +44 1865 272866 (PA)
> >Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> http://www.stat.math.ethz.ch/mailman/listinfo/r-help
>