[R] dreaded p-val for d^2 of a glm / gam

Fri Mar 28 15:49:21 CET 2008

      Do you think p values are bad?  That's not my understanding.  P 
values may not be reported by some software, because the algorithm 
developers didn't know how to efficiently compute a reasonably accurate 
p value.  And if you do a thousand or a million statistical tests and 
report only the results with the smallest p value, that's ultimately 
fraudulent.  (http://en.wikipedia.org/wiki/Multiple_comparisons). 

      However, the concept of a significance probability or p value is 
quite valuable (http://en.wikipedia.org/wiki/P-value), though like any 
tool or concept, it can be misused. 

      Hope this helps. 
      Spencer Graves

Monica Pisica wrote:
> Hi Spencer and David,
>  
> Thanks for your answers .... first - yes it is deviance but just 
> before i just spoke and explain that it is the equivalent of r square 
> from the "normal" regression.....
>  
> I hope i can do the comparison and show that the model is significant 
> and hopefully i am off the hook. Sincerely i try to avoid all this 
> business with p-values but certainly some are quite found of it. The 
> problem is that you get almost by default a p-value from an F test if 
> you use lm for example, so ..... quite few times i was asked to 
> provide a similar thing for quite different models.
>  
> Thanks again,
>  
> Monica
>
> > Date: Thu, 27 Mar 2008 16:38:12 -0700
> > From: spencer.graves at pdf.com
> > To: pisicandru at hotmail.com
> > CC: r-help at r-project.org
> > Subject: Re: [R] dreaded p-val for d^2 of a glm / gam
> >
> > I assume you mean 'deviance', not 'squared deviance'; if the
> > latter, then I have no idea.
> >
> > If the former, then a short and fairly quick answer to your
> > question is that 2*log(likelihood ratio) for nested hypotheses is
> > approximately chi-square with numbers of degrees of freedom = the 
> number
> > of parameters in the larger model fixed to get the smaller model, under
> > standard regularity conditions, the most important of which is that the
> > maximum likelihood is not at a boundary.
> >
> > For specificity, consider the following modification of the first
> > example in the 'glm' help page:
> >
> > counts <- c(18,17,15,20,10,20,25,13,12)
> > outcome <- gl(3,1,9)
> > treatment <- gl(3,3)
> > glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
> > glm.D93t <- glm(counts ~ treatment, family=poisson())
> > anova(glm.D93t, glm.D93, test="Chisq")
> >
> > The p-value is not printed by default, because some people would
> > rather NOT give an answer than give an answer that might not be very
> > accurate in the cases where this chi-square approximation is not very
> > good. To check that, you could do a Monte Carlo, refit the model with,
> > say, 1000 random permutations of your response variable, collect
> > anova(glm.D93t, glm.D93)[2, "Deviance"] in a vector, and then find out
> > how extreme the deviance you actually got is relative to this
> > permutation distribution.
> >
> > Hope this helps.
> > Spencer Graves
> > p.s. Regarding your 'dread', please see fortune("children")
> >
> > Monica Pisica wrote:
> > > OK,
> > >
> > > I really dread to ask that .... much more that I know some 
> discussion about p-values and if they are relevant for regressions 
> were already on the list. I know to get p-val of regression 
> coefficients - this is not a problem. But unfortunately one editor of 
> a journal where i would like to publish some results insists in giving 
> p-values for the squared deviance i get out from different glm and gam 
> models. I came up with this solution, but sincerely i would like to 
> get yours'all opinion on the matter.
> > >
> > > p1.glm <- glm(count ~be+ch+crr+home, family = 'poisson')
> > >
> > > # count - is count of species (vegetation)
> > > # be, ch, crr, home - different lidar metrics
> > >
> > > # calculating d^2
> > > d2.p1 <- round((p1.glm[[12]]-p1.glm[[10]])/p1.glm[[12]],4)
> > > d2.p1
> > > 0.6705
> > >
> > > # calculating f statistics with N = 148 and n=4; f = 
> (N-n-1)/(N-1)(1-d^2)
> > > f <- (148-4-1)/(147*(1-0.6705))
> > > f
> > > [1] 2.952319
> > >
> > > #calculating p-value
> > > pval.glm <- 1-pf(f, 147,143)
> > > pval.glm
> > > [1] 1.135693e-10
> > >
> > > So, what do you think? Is this acceptable if i really have to give 
> a p-value for the deviance squared? If it is i think i will transform 
> everything in a fuction ....
> > >
> > > Thanks,
> > >
> > > Monica
> > > _________________________________________________________________
> > > Windows Live Hotmail is giving away Zunes.
> > >
> > > M_Mobile_Zune_V3
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
>
>
> ------------------------------------------------------------------------
> Watch “Cause Effect,” a show about real people making a real 
> difference. Learn more. 
> <http://im.live.com/Messenger/IM/MTV/?source=text_watchcause>