[R] Interaction term not significant when using glm???

(Ted Harding) Ted.Harding at manchester.ac.uk
Sat Mar 7 15:20:14 CET 2009


On 07-Mar-09 10:57:17, Thomas Lumley wrote:
> On Fri, 6 Mar 2009, joris meys wrote:
>> Dear all,
>> I have a dataset where the interaction is more than obvious,
>> but I was asked to give a p-value, so I ran a logistic regression
>> using glm. Very funny, in the outcome the interaction term is NOT
>> significant, although that's completely counterintuitive. There
>> are 3 variables : spot (binary response), constr (gene construct)
>> and vernalized (growth conditions). Only for the FLC construct
>> after vernalization should the chance of spots be lower. So in
>> the model one would expect the interaction term to be significant.
>>
>> Yet, only the two main terms are significant here. Can it be my
>> data is too sparse to use these models? Am I using the wrong method?
> 
> The point estimate for the interaction term is large: 1.79, or an
> odds ratio of nearly 6.
> 
> The data are very strongly overdispersed (variance is 45 times larger
> than it should be), so they don't fit a binomial model well. If you
> used a quasibinomial model you would get no statistical significance
> for any of the terms.
> 
> I would say the problem is partly a combination of the overdispersion and
> the sample size.  It doesn't help that the situation appears to be a
> difference between the FLC:yes cell and the other three cells, a
> difference that is spread out over the three parameters.
>       -thomas

The following way of looking at it may be helpful. Display the data
as two 2x2 tables (one for each level of 'constr'):

                 Spot                            Spot
constr="FLC"    1    0          constr="free"   1    0
--------------+-------+---      --------------+-------+---
Vern = "yes": |20   27| 47      Vern = "yes": |42    3| 45
              |       |                       |       |
Vern = "no" : |42    3| 45      Vern = "no" : |44    1| 45
--------------+-------+---      --------------+-------+---
              |62   30| 92                    |86    4| 90
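
For anyone who wants to reproduce these two tables in R, here is a
minimal sketch. It assumes the counts from the data-generation code
quoted at the end of this message; the object names 'd' and 'tabs'
are my own and do not appear in the original post.

```r
# Counts as given in the original post (Freq), one row per cell.
d <- data.frame(
  spot       = rep(c(0, 1), times = 4),
  constr     = rep(c("FLC", "FLC", "free", "free"), times = 2),
  vernalized = rep(c("no", "yes"), each = 4),
  Freq       = c(3, 42, 1, 44, 27, 20, 3, 42)
)

# Put the levels in the order used in the display: Vern yes/no, Spot 1/0.
d$vernalized <- factor(d$vernalized, levels = c("yes", "no"))
d$spot       <- factor(d$spot, levels = c(1, 0))

# 2x2x2 array: vernalized x spot, one slice per construct.
tabs <- xtabs(Freq ~ vernalized + spot + constr, data = d)

# Each slice with its row and column totals.
addmargins(tabs[, , "FLC"])
addmargins(tabs[, , "free"])
```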

It seems clear that the constr="free" table carries almost no
information about the relationship between 'vernalized' and 'spot'.
Given the margins, only five tables are possible. The two most extreme
ones (by col: (45,41)/(0,4) and (41,45)/(4,0)) each have probability
0.058 of occurring under no association; the other three have
probabilities 0.250, 0.384 and 0.250. The observed table,
(42,44)/(3,1), is one of the two with probability 0.250.
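
These probabilities are hypergeometric, so they can be checked with
dhyper(). The sketch below assumes fixed margins for the "free" table
(45 plants per row, 86 with spots and 4 without); the names 'k', 'p'
and 'free' are only illustrative.

```r
# Margins of the constr="free" table: 86 spot, 4 no-spot; 45 plants per row.
# k = number of no-spot plants in the Vern="yes" row, so k can be 0..4.
k <- 0:4
p <- dhyper(k, m = 4, n = 86, k = 45)
round(p, 3)
# 0.058 0.250 0.384 0.250 0.058

# Fisher's exact test on the observed "free" table (rows: Vern yes/no).
free <- matrix(c(42, 44, 3, 1), nrow = 2,
               dimnames = list(Vern = c("yes", "no"), Spot = c("1", "0")))
fisher.test(free)$p.value
```

The Fisher p-value is just the sum of the probabilities of all tables
no more likely than the observed one, so with only five possible
tables it cannot be small here.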

On the other hand, the constr="FLC" table shows a very marked
association between 'vernalized' and 'spot'.
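
To put numbers on the two associations, one can compute the log odds
ratio in each table by hand. This is only a sketch: 'flc', 'free' and
the helper 'log_or' are names I made up for the illustration.

```r
# Rows: Vern yes/no; columns: Spot 1/0 (counts from the tables above).
flc  <- matrix(c(20, 42, 27, 3), nrow = 2)
free <- matrix(c(42, 44, 3, 1),  nrow = 2)

# Log odds ratio of spot for Vern="yes" versus Vern="no".
log_or <- function(tab) log(tab[1, 1] * tab[2, 2] / (tab[1, 2] * tab[2, 1]))

log_or(flc)                  # about -2.94
log_or(free)                 # about -1.15
log_or(free) - log_or(flc)   # about 1.79, the interaction estimate quoted above

# Exact test of association within the FLC table alone.
fisher.test(flc)$p.value
```

The difference of the two log odds ratios is exactly the interaction
coefficient that glm() reports with the default treatment contrasts.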

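But, given that there is not much information in the "free" table,
you are not going to find an interaction between 'constr' and
'vernalized': the interaction is the difference between the two
tables' associations, and the "free" one is too poorly determined to
pin that difference down. (You could try out the glm() for each of the
possible "free" tables, given the margins).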
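
One way that experiment might look is sketched below. It keeps the FLC
counts as observed and runs through the five "free" tables compatible
with the margins, refitting the same model each time. The name
'interaction_p' is hypothetical, and the Wald p-values for k = 0 and
k = 4 should not be trusted, since an empty cell sends the estimate
towards infinity (hence the suppressWarnings()).

```r
# FLC cells are fixed at the observed counts; the "free" cells vary with k,
# the number of no-spot plants in the Vern="yes" row (margins 45/45, 86/4).
interaction_p <- sapply(0:4, function(k) {
  d <- data.frame(
    spot       = rep(c(0, 1), times = 4),
    constr     = factor(rep(c("FLC", "FLC", "free", "free"), times = 2)),
    vernalized = factor(rep(c("no", "yes"), each = 4)),
    Freq       = c(3, 42, 4 - k, 41 + k, 27, 20, k, 45 - k)
  )
  d <- d[d$Freq > 0, ]  # drop empty cells (k = 0 and k = 4)
  fit <- suppressWarnings(
    glm(spot ~ constr * vernalized, weights = Freq, data = d,
        family = binomial)
  )
  # Wald p-value for the interaction coefficient.
  coef(summary(fit))["constrfree:vernalizedyes", "Pr(>|z|)"]
})
names(interaction_p) <- paste0("k=", 0:4)
round(interaction_p, 3)
```

The entry for k=3 corresponds to the observed data.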

So, in my view, the aetiology of the symptoms is hypospotification
in the "free" lifestyle ... Treatment: Increase your intake of
"free"! Then you may get enough information about association in
that case.

Ted.


>> # data generation
>> # (built as a data frame so that Freq stays numeric; going through a
>> # character matrix would turn every column into text)
>> testdata <- data.frame(
>>   spot       = rep(0:1, times = 4),
>>   constr     = factor(rep(c("FLC", "FLC", "free", "free"), times = 2)),
>>   vernalized = factor(rep(c("no", "yes"), each = 4)),
>>   Freq       = c(3, 42, 1, 44, 27, 20, 3, 42)
>> )
>>
>> # model
>> T0fit <- glm(spot ~ constr * vernalized, weights = Freq, data = testdata,
>>              family = binomial)
>> anova(T0fit, test = "Chisq")
>>
>> Kind regards
>> Joris
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
> 
> Thomas Lumley                 Assoc. Professor, Biostatistics
> tlumley at u.washington.edu      University of Washington, Seattle

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 07-Mar-09                                       Time: 14:14:06
------------------------------ XFMail ------------------------------



