[R] vglm(), t values and p values

Wed Nov 4 15:00:26 CET 2009

Hi All,

I'm fitting an proportional odds model using vglm() from VGAM.

My response variable is the severity of diseases, going from 0 to 5 (the 
severity is actually an ordered factor).

The independent variables are: 1 genetic marker, time of medical observation, 
age, sex. What I *need* is a p-value for the genetic marker. Because I have ~1.5 
million markers I'd rather not faffing around too much.

My model is:

 > mod.vglm = vglm(disease.status ~ x + time + age + sex, family = 
cumulative(par = T))

where x is my genetic marker, coded as 0/1/2, time is days of medical observation.

 > summary(mod.vglm) works:

Call:
vglm(formula = disease.status ~ x + time + age + sex, family = cumulative(par = T))

Pearson Residuals:
                    Min       1Q   Median       3Q     Max
logit(P[Y<=1]) -0.6642 -0.28704 -0.18329 -0.11681  3.8919
logit(P[Y<=2]) -2.5580 -0.48080 -0.23315  0.47388  2.5983
logit(P[Y<=3]) -2.1565 -0.56961  0.22089  0.44349 10.7964
logit(P[Y<=4]) -3.3175  0.13064  0.20117  0.43176 12.5233

Coefficients:
                     Value Std. Error  t value
(Intercept):1 -2.4460e+00 4.2791e-01  -5.7162
(Intercept):2 -7.1078e-01 4.1628e-01  -1.7074
(Intercept):3  3.7619e-01 4.1545e-01   0.9055
(Intercept):4  1.7467e+00 4.2092e-01   4.1496
x              4.1421e-01 1.9762e-01   2.0959
time          -3.6021e-04 3.0387e-05 -11.8540
age           -2.6115e-05 9.2504e-06  -2.8232
sexM           1.0188e-01 1.2491e-01   0.8156

Number of linear predictors:  4

Names of linear predictors:
logit(P[Y<=1]), logit(P[Y<=2]), logit(P[Y<=3]), logit(P[Y<=4])

Dispersion Parameter for cumulative family:   1

Residual Deviance: 2475.937 on 3460 degrees of freedom

Log-likelihood: -1237.969 on 3460 degrees of freedom

#######################

So here are my questions:

1) I need to get the t value for x, so I can use "1 - pt(tvalue,1)" to find some 
sort of probability value for x. That's not trivial. Additionally, I assume df 
for x is 1, hence I plan to use  "1 - pt(tvalue,1)", though I might well be 
wrong. In any case getting the darned t value seems impossible

2) because of the difficulty of getting (1), it there a way of getting vglm() to 
spit out a p-value for x please?

I do recon many people might scoff at my crass desire for a p-value, but I'm 
dealing with some dire phenotype in a whole genome analysis where the *only* 
thing that matters are p-values. I have to be quite unsophysticated I'm afraid.

Best,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602     Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com