[R] Order of formula terms in model.matrix

Guelman, Leo leo.guelman at rbc.com
Mon Jan 18 17:22:42 CET 2016


Thanks, Peter. That makes sense. Nevertheless, what comes as a surprise to me (and maybe to others) is that one can potentially get different fits simply by swapping the terms in the model formula.
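
A quick way to see whether two term orderings can give different fits is to compare the design matrices directly. The sketch below reuses the data from the example quoted further down and checks whether the two matrices span the same column space. The helper rank_of and the variable same_space are names made up for this illustration, and the rank comparison is just one possible check, not the only one.

```r
# Data as in the example quoted further down the thread (trt left numeric)
set.seed(1)
x1  <- rnorm(100)
f1  <- factor(sample(letters[1:3], 100, replace = TRUE))
trt <- sample(c(-1, 1), 100, replace = TRUE)
df  <- data.frame(x1 = x1, f1 = f1, trt = trt)

# Design matrices for the two term orderings
X1 <- model.matrix(~ x1:trt + f1:trt, data = df)
X2 <- model.matrix(~ f1:trt + x1:trt, data = df)

# Inspect how f1 was coded in each case
colnames(X1)
colnames(X2)

# Hypothetical helper: numerical rank of a matrix via its QR decomposition
rank_of <- function(X) qr(X)$rank

# The column spaces coincide only if the ranks agree and stacking the
# two matrices side by side does not increase the rank
same_space <- rank_of(X1) == rank_of(X2) &&
  rank_of(cbind(X1, X2)) == rank_of(X1)
same_space
```

If same_space is FALSE, the two formulas describe different models, so different fitted values are to be expected.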

Best,
Leo. 

-----Original Message-----
From: peter dalgaard [mailto:pdalgd at gmail.com] 
Sent: 2016, January, 18 11:16 AM
To: Guelman, Leo
Cc: r-help at r-project.org; Charles C. Berry
Subject: Re: [R] Order of formula terms in model.matrix


On 18 Jan 2016, at 16:49 , Guelman, Leo <leo.guelman at rbc.com> wrote:

> Is it really the same model?


No, and I didn't say that they would be. I did say that they would be in the all-factor case, which does seem to be right:

> df$trt <- factor(df$trt)
> fit1 <- glm(y ~ x1:trt + f1:trt, data = df, family = binomial)
> fit2 <- glm(y ~ f1:trt + x1:trt, data = df, family = binomial) 
> plot(fitted(fit1), fitted(fit2)) # still differs
> df$x1 <- factor(sample(c(-1,1), 100, replace = TRUE))
> fit1 <- glm(y ~ x1:trt + f1:trt, data = df, family = binomial)
> fit2 <- glm(y ~ f1:trt + x1:trt, data = df, family = binomial) 
> plot(fitted(fit1), fitted(fit2)) # looks like it's on diagonal 
> identical(fitted(fit1), fitted(fit2)) # wrong check
[1] FALSE
> all.equal(fitted(fit1), fitted(fit2)) # better
[1] TRUE
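
The reason identical() is the wrong check here can be shown with plain numbers: it demands exact equality, while all.equal() compares up to a small numerical tolerance. A minimal sketch follows; the values a and b are arbitrary and chosen only for illustration.

```r
a <- 0.1 + 0.2
b <- 0.3

# Exact comparison fails because of floating-point rounding
identical(a, b)          # FALSE

# Comparison up to a tolerance succeeds
isTRUE(all.equal(a, b))  # TRUE

# all.equal() returns a character description, not FALSE, when the
# values differ, so wrap it in isTRUE() when a logical is needed
isTRUE(all.equal(1, 1.1))  # FALSE
```

The same applies to fitted values from two parametrizations of one model: they can agree to within rounding error without being bit-for-bit identical.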


-pd


>  
> Following the example provided by Lars:
>  
> set.seed(1)
> x1 <- rnorm(100)
> f1 <- factor(sample(letters[1:3], 100, replace = TRUE))
> trt <- sample(c(-1,1), 100, replace = TRUE)
> y <- factor(sample(c(0,1), 100, T))
> df <- data.frame(y=y, x1=x1, f1=f1, trt=trt)
>  
> fit1 <- glm(y ~ x1:trt + f1:trt, data = df, family = binomial)
> coef(fit1)
>  
> fit2 <- glm(y ~ f1:trt + x1:trt, data = df, family = binomial)
> coef(fit2)
>  
> identical(fitted(fit1), fitted(fit2))
> [1] FALSE
>  
>  
>  
> ______________________________________________________________________
> _
> 
> If you received this email in error, please advise the sender (by return email or otherwise) immediately. You have consented to receive the attached electronically at the above-noted email address; please retain a copy of this confirmation for future reference.
> 
> Si vous recevez ce courriel par erreur, veuillez en aviser l'expéditeur immédiatement, par retour de courriel ou par un autre moyen. Vous avez accepté de recevoir le(s) document(s) ci-joint(s) par voie électronique à l'adresse courriel indiquée ci-dessus; veuillez conserver une copie de cette confirmation pour les fins de reference future.
> 

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
