[R] fit.contrast and interaction terms

Berta ibanez at bioef.org
Thu Oct 11 13:23:15 CEST 2007


Hi Gregory, thank you so much for your answer, the use of estimable() gave 
me interesting hints.
Following the same small example as before, I observed that "fit.contrast" 
and "estimable" give the same response when contrasting A vs D in the 
categorial variable x (differences among groups at the intercept term), and 
estimable allows you to fix the continuous variable x2 to a given value (say 
2) and contrast if the differences are significant between groups at this 
fixed numeric variable (see example + results below). However, in my real 
data, the resutls are obtained but a warning appears: "Warning message: 
Degrees of freedom vary among parameters used to construct linear 
contrast(s): 1. Using the smallest df among the set of parameters. in: 
estimable(frec.intermixto, c("xD"=1, "edad:xD"=40)).  I suspect that 
something is wrong with the d.f. (it seems that narrower conf. int than 
expected are obtained). Is it a wrong conclusion to interpretate the 
p-values? Is there another way to adjuts this, or it is just inappropriate 
to do this type of contrast?

Thanks a lot in advance!
Berta

SMALL EXPAMPLE
library(gmodels)
set.seed(100)
y <- rnorm(100)
x <-  cut(rnorm(100, mean=y, sd=0.25),c(-4,-1.5,0,1.5,4)) ; levels(x)<- 
c("A","B","C","D")
x2 <- rnorm(100,mean=y,sd=0.5)
reg2 <- lm(y ~ x * x2 )
summary(reg2)
fit.contrast(reg2,x,c(-1,0,0,1)) # g4 vs g1
#                 Estimate Std. Error t value     Pr(>|t|)
#x c=( -1 0 0 1 ) 3.067346  0.7645143 4.01215 0.0001224165
estimable(reg2, c("xD"=1) )
#                    Estimate Std. Error  t value DF     Pr(>|t|)
#(0 0 0 1 0 0 0 0) 3.067346  0.7645143 4.01215 92 0.0001224165
estimable(reg2, c("xD"=1, "xD:x2"=2) )
#                  Estimate Std. Error  t value DF     Pr(>|t|)
#(0 0 0 1 0 0 0 2) 3.836914  0.6408911 5.986843 92 4.083757e-08

win.graph()
plot(c(min(x2), max(x2)), c(min(y),max(y)), type="n", xlab="x2", ylab="y")
points(c(0, max(x2)), c(-1.936, -1.936+0.032*max(x2)), type="l", lty=1)
points(c(0, max(x2)), c(-1.936+3.067, 
(-1.936+3.067+(0.032+0.3847)*max(x2))), type="l", lty=2)
title("XA vs XD")
legend(-2,2,c("XA", "XD"), lty=c(1,2))

###########################################################################
###########################################################################
##########################################################################







>From: Gregory Warnes <gregory.warnes at mac.com>
>Date: Tue, 9 Oct 2007 09:20:51 -0400
>To: Berta <ibanez at bioef.org>
>X-Mailer: Apple Mail (2.752.3)
>X-Virus-Scanned: by amavisd-new at stat.math.ethz.ch
>Cc: r-help at stat.math.ethz.ch


>Hello Berta,
>
>gmodels::fit.contrasts() simply performs a single-variable contrast
>using the model you have specified.   To perform the more involved
>contrasts that you are describing, there are two approaches:
>
>1) use the estimable() function in the gmodels package.
>gmodels::estimable() allows you to provides an arbitrary function to
>be applied to the estimated model parameters, allowing you to perform
>any appropriate (or inappropriate) contrast calculation.
>
>2) Use the contrast functions provided by Frank Harrell's Hmisc
>package.   Frank's functions allow you to specify the desired  value
>or level of each parameter for the contrast, and handle unspecified
>parameters.
>
>I hope this helps,
>
>-Greg
>
>
>
>On Oct 9, 2007, at 6:10AM , Berta wrote:
>
> > Dear R-users,
> > I want to fit a linear model with Y as response variable and X a
> > categorical variable (with 4 categories), with the aim of comparing
> > the basal category of X (category=1) with category 4.
> > Unfortunately, there is another categorical variable with 2
> > categories which interact with x and I have to include it, so my
> > model is s  "reg3: Y=x*x3". Using fit.contrast to make the contrast
> > (category 1 vs category 4) with options(contrasts=c
> > ("contr.treatment", "contr.poly")), it makes the contrast but just
> > for the basal category of x3, (coincident with the corresponding
> > result of summary(reg3)), so that it is not what I am looking for,
> > and it seems that when I write (contrasts=c("contr.sum",
> > "contr.poly")) before fit.contrast, it adjust for x3. Here I send a
> > SMALL EXAMPLE that tries to imiate my problem.
> >
> > library(gmodels)
> > set.seed(100)
> > options(contrasts=c("contr.treatment", "contr.poly"))
> > y <- rnorm(100)
> > x <-  cut(rnorm(100, mean=y, sd=0.25),c(-4,-1.5,0,1.5,4)) # 4
> > CATEGORIES
> > x3 <-  cut(rnorm(100, mean=y, sd=8),c(-50,0,50)) # 2 CATEGORIES
> > reg3<-lm(y~ x * x3 )
> > summary(reg3) # category 1 vs category 4 estimate: 3.84776 , for
> > basal category of x3
> > fit.contrast(reg3,x,c(-1,0,0,1)) # g4 vs g1  # category 1 vs
> > category 4 estimate: 3.84776 ,
> > options(contrasts=c("contr.sum", "contr.poly"))
> > fit.contrast(reg3,x,c(-1,0,0,1)) # g4 vs g1 #  category 1 vs
> > category 4 estimate:  4.010439
> >
> > I deduce that the global comparison of category 1 vs category 4 is
> > 4.01, and not 3.84.
> >
> > Unfortunately, the real variable that interact with x is not
> > categorical but continuous in my real case, and the new model is
> > Y=x*x2. Again, with the contr.treatment option, the comparison of
> > category 1 vs category 4 is done for the intercept, that is, for
> > the numerical variable assumed to be 0. As i am interested in
> > comparing category 1 vs category 4 adjusting for x2 in general
> > terms, I use contr.sum as before, but it does not seem to produce
> > any modification:
> >
> > x2 <- rnorm(100,mean=y,sd=0.5) # NUMERIC
> > reg2 <- lm(y ~ x * x2 )
> > summary(reg2) # category 1 vs category 4 estimate: 3.067346 for x2=0
> > options(contrasts=c("contr.treatment", "contr.poly"))
> > fit.contrast(reg2,x,c(-1,0,0,1)) # g4 vs g1 estimate: 3.067346 for
> > x2=0
> > options(contrasts=c("contr.sum", "contr.poly"))
> > fit.contrast(reg2,x,c(-1,0,0,1)) # g4 vs g1 estimate: 3.067346 for
> > x2=0
> >
> > The question is: How could I contrast simulaneously in global terms
> > (not intercept and slope separately) if there are differences among
> > category 1 vs category 4 but adjusting for a continuous variable?
> > Or it is not possible to do so, and I have to contrast both
> > (difference of intercepts and difference of slopes separately) and
> > obtain a global conclussion?
> >
> > Thanks a lot in advance,
> >
> > Berta
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> > guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide 
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list