[R] VGAM and vglm

Mon Oct 29 23:42:11 CET 2007

Hi Folks,
I wonderif someone who is familiar with the details
of vglm in the VGAM package can assist me. I'm new
to using it, and there doesn;t seem much in the
documentation that's relevant to the question below.

Say I have a vector x of 0/1 responses and another
vector y of 0/1 responses, these in fact being a
bivariate set of 0/1 responses equivalent to
cbind(x,y).

E.g.
set.seed(12345)
x<-sample(c(0,1),50,replace=TRUE)
y<-sample(c(0,1),50,replace=TRUE)
table(x,y)
##    y
##    0  1 
##x 0 10 17
##  1 11 12

Now, it seems to me that if I use vglm to fit cbind(x,y)
it simply produces separate fits for x and y, without
taking account of their association. Thus, first (some
lines of the output removed):

summary(vglm(cbind(x,y)~1,family=binomialff(mv=TRUE)))
##Coefficients:
##                 Value Std. Error  t value
##(Intercept):1 -0.16034    0.28375 -0.56508
##(Intercept):2  0.32277    0.28652  1.12651
##
##Number of linear predictors:  2 
##Names of linear predictors: logit(E[x]), logit(E[y])

Now if I shuffle say y:

y1<-sample(y,50)
table(x,y1)
##    y1
##    0  1 
##x 0 11 16
##  1 10 13

summary(vglm(cbind(x,y1)~1,family=binomialff(mv=TRUE)))
##Coefficients:
##                 Value Std. Error  t value
##(Intercept):1 -0.16034    0.28375 -0.56508
##(Intercept):2  0.32277    0.28652  1.12651
##
##Number of linear predictors:  2 
##Names of linear predictors: logit(E[x]), logit(E[y1])

Thus getting exactly the same coefficients; indeed the
results are exactly what I would get if I did separate
glm fits in the first place:

summary(glm(x~1,family=binomial))
##Call:
##glm(formula = x ~ 1, family = binomial)
##Coefficients:
##            Estimate Std. Error z value Pr(>|z|)
##(Intercept)  -0.1603     0.2838  -0.565    0.572

##summary(glm(y~1,family=binomial))
##Coefficients:
##            Estimate Std. Error z value Pr(>|z|)
##(Intercept)   0.3228     0.2865   1.126     0.26

Plus I get the P-values thrown in for good measure!

I had been hoping that vglm, with the "mv=TRUE" option
set in the family=binomialff(mv=TRUE) option, would
treat cbind(x,y) as multivariate data and take account
of their association in preforming the fit (i.e., in
the above example where there are no covariates, would
fit the 2x2 table). This issue becomes more interesting
when there are covariates as well. In fact, it seems to
simply fit the marginals.

So I'm wondering what is to be gained by using vglm
instead of simple glm?

Or is there something I've overlooked, which would
cause a true multivaruate fit to to be done by vglm?

If course, I can use brute force to fit the 2x2 table:

summary(vglm(cbind(x*y,x*(1-y),(1-x)*y,(1-x)*(1-y))~1,
        family=binomialff(mv=TRUE)))
##Coefficients:
##                Value Std. Error t value
##(Intercept):1 -1.1527    0.33113 -3.4810
##(Intercept):2 -1.2657    0.34139 -3.7073
##(Intercept):3 -0.6633    0.29854 -2.2218
##(Intercept):4 -1.3863    0.35355 -3.9210
##
##Number of linear predictors:  4 
##Names of linear predictors: logit(mu1),
##  logit(mu2), logit(mu3), logit(mu4)

but that seems to be going to greater lengths than
one should need to!

Any comments and suggestions will be welcome!

Best wishes to all,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 29-Oct-07                                       Time: 23:42:08
------------------------------ XFMail ------------------------------