[R] Contingency table: logistic regression

Suresh Krishna ssk2031 at columbia.edu
Thu Mar 31 03:29:09 CEST 2005


Hi,

I am analyzing a data set with more than 1000 independent cases 
(collected in an unrestricted manner), where each case has 3 variables 
associated with it: a factor variable with 0/1 levels (called Fac), 
a second factor variable with 8 levels (X), and a response variable 
with two levels (Y: 0/1). I am trying to see whether Fac has an effect on 
the relationship between X and the proportion of 1s in Y.
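
For readers who want to picture this layout, here is a minimal sketch of the three-way table and the per-cell proportions. The data frame `dat` is simulated and purely hypothetical (it is not the real data set); only the variable names Fac, X and Y follow the model call shown further down.

```r
# Simulated stand-in data; names follow the model call, values are made up.
set.seed(42)
n   <- 1200
dat <- data.frame(
  Fac = factor(sample(0:1, n, replace = TRUE)),
  X   = factor(sample(1:8, n, replace = TRUE)),
  Y   = rbinom(n, 1, 0.6)
)

# Three-way contingency table of counts: Fac x X x Y
tab <- xtabs(~ Fac + X + Y, data = dat)
print(ftable(tab, row.vars = c("Fac", "X")))

# Proportion of Y = 1 within each Fac-by-X cell (2 x 8 matrix)
prop1 <- prop.table(tab, margin = c(1, 2))[, , "1"]
print(round(prop1, 2))
```

These sixteen cell proportions are the quantities that the logistic regression below describes on the log-odds scale.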

I have three questions:

a) I had never used GLMs for this or any other sort of analysis before 
today, so am I interpreting the output correctly?

After setting options(contrasts=c("contr.treatment","contr.poly"))

I did:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Begin R output~~~~~~~~~~~~~~~~~~~~~~
Call:
glm(formula = Y ~ X * Fac, family = "binomial", data = mat, subset = 
sactype < 3 & numstim == 16)

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-2.232  -0.901   0.416   0.985   1.656

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)    2.405      0.209   11.52  < 2e-16 ***
X2            -2.511      0.293   -8.57  < 2e-16 ***
X3            -3.283      0.286  -11.47  < 2e-16 ***
X4            -2.009      0.302   -6.65    3e-11 ***
X5            -3.098      0.276  -11.22  < 2e-16 ***
X6            -2.580      0.288   -8.97  < 2e-16 ***
X7            -3.484      0.288  -12.09  < 2e-16 ***
X8            -2.811      0.328   -8.56  < 2e-16 ***
Fac           -1.558      0.721   -2.16  0.03071 *
X2:Fac         2.133      0.942    2.26  0.02351 *
X3:Fac         1.848      0.932    1.98  0.04748 *
X4:Fac         2.836      0.982    2.89  0.00386 **
X5:Fac         3.263      0.945    3.45  0.00056 ***
X6:Fac         3.630      0.971    3.74  0.00018 ***
X7:Fac         3.256      0.883    3.69  0.00023 ***
X8:Fac         3.350      1.000    3.35  0.00081 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

(Dispersion parameter for binomial family taken to be 1)

     Null deviance: 1619.4  on 1178  degrees of freedom
Residual deviance: 1271.2  on 1163  degrees of freedom
AIC: 1303

Number of Fisher Scoring iterations: 5

~~~~~~~~~~~~~~~~~~~~~~~~End R output~~~~~~~~~~~~~~~~~~~~~~~~~~~

I am reading it like this: each of the X2...X8 terms tells me whether 
the proportion of 1s at that level of X, with Fac = 0, differs from the 
proportion at level X1 with Fac = 0. Each of the interaction terms 
(X2:Fac, ..., X8:Fac) tells me whether the difference between that level 
(X2...X8) and X1 differs between Fac = 0 and Fac = 1; and this is the 
same thing as whether the proportion associated with X2...X8 differs 
between the two levels of Fac. So the X2:Fac...X8:Fac terms are like 
performing a simple 2x2 analysis of the effect of Fac on Y, given X2 
(...X8).

How much of this is incorrect ?
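
One way to check this reading is to fit the same model to simulated data and compare the coefficients with the cell log-odds computed by hand. The sketch below is only that: the data frame `sim` and the matrix `p_true` are invented for illustration, and Fac is assumed to be coded as numeric 0/1, since the output above labels its coefficient "Fac".

```r
# Simulated stand-in for the real data; all names and probabilities are hypothetical.
set.seed(1)
n   <- 4000
sim <- data.frame(
  X   = factor(sample(1:8, n, replace = TRUE)),
  Fac = sample(0:1, n, replace = TRUE)
)
# Arbitrary true probabilities of Y = 1, one per X-by-Fac cell (8 x 2)
p_true <- matrix(seq(0.15, 0.85, length.out = 16), nrow = 8)
sim$Y  <- rbinom(n, 1, p_true[cbind(as.integer(sim$X), sim$Fac + 1)])

fit <- glm(Y ~ X * Fac, family = binomial, data = sim)
b   <- coef(fit)

# Observed log-odds of Y = 1 in each cell (rows: X = 1..8, columns: Fac = 0, 1)
cell_logit <- qlogis(tapply(sim$Y, list(sim$X, sim$Fac), mean))

# The model is saturated, so each coefficient is a contrast of cell log-odds:
x2_check  <- cell_logit["2", "0"] - cell_logit["1", "0"]   # compare with b["X2"]
fac_check <- cell_logit["1", "1"] - cell_logit["1", "0"]   # compare with b["Fac"]
int_check <- (cell_logit["2", "1"] - cell_logit["2", "0"]) -
             (cell_logit["1", "1"] - cell_logit["1", "0"]) # compare with b["X2:Fac"]

# Effect of Fac within level X = 2 (the 2x2 comparison at that level)
fac_at_x2 <- unname(b["Fac"] + b["X2:Fac"])

print(c(X2 = unname(b["X2"]), by_hand = unname(x2_check)))
print(c(X2_Fac = unname(b["X2:Fac"]), by_hand = unname(int_check)))
print(c(Fac_at_X2 = fac_at_x2,
        by_hand = unname(cell_logit["2", "1"] - cell_logit["2", "0"])))
```

In this sketch the interaction coefficient matches a difference of differences on the log-odds scale, while the effect of Fac within level 2 of X is the sum of the Fac and X2:Fac coefficients. That suggests the X2:Fac term alone is not quite the 2x2 comparison at that level.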

My other two questions are:

b) Is this the right way to approach this analysis in R ? Or am I better 
off reading about multi-way contingency table analyses and using them ?

and

c) How do I incorporate a correction for multiple testing into the above 
analysis? Testing the effect of Fac on the relationship between X and Y 
was planned in advance.
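
As a rough sketch of one option for (c), the seven interaction p-values printed in the output above can be adjusted with `p.adjust()`. Treating exactly these seven Wald tests as the family to correct over is an assumption made here for illustration, as is the choice of the Holm and Bonferroni methods.

```r
# Wald p-values for the interaction terms, copied from the glm output above
p_raw <- c("X2:Fac" = 0.02351, "X3:Fac" = 0.04748, "X4:Fac" = 0.00386,
           "X5:Fac" = 0.00056, "X6:Fac" = 0.00018, "X7:Fac" = 0.00023,
           "X8:Fac" = 0.00081)

# Holm step-down adjustment (controls the family-wise error rate)
p_holm <- p.adjust(p_raw, method = "holm")

# Plain Bonferroni, for comparison (more conservative)
p_bonf <- p.adjust(p_raw, method = "bonferroni")

print(round(cbind(raw = p_raw, holm = p_holm, bonferroni = p_bonf), 5))
```

Another route that avoids the multiplicity question for the overall effect is a single likelihood-ratio test of the whole interaction, comparing the fit of Y ~ X + Fac with that of Y ~ X * Fac via `anova(..., test = "Chisq")`.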

I would greatly and respectfully appreciate all pointers, tips, and 
admonitions.

Thank you !!!!

Suresh



