[R] How shall one present LRT test statistic in a scientific journal ?

Frank E Harrell Jr f.harrell at vanderbilt.edu
Thu Nov 26 22:09:57 CET 2009


David Winsemius wrote:
> 
> On Nov 26, 2009, at 12:46 PM, Peter Dalgaard wrote:
> 
>> David Winsemius wrote:
>>>
>>> On Nov 26, 2009, at 12:14 PM, JVezilier wrote:
>>>
>>>>
>>>> Hello !!
>>>>
>>>> I've recently been having a debate with my PhD supervisor regarding
>>>> how to write the result of a likelihood ratio test in an article I'm
>>>> about to submit.
>>>>
>>>> I analysed my data using "lme" mixed modelling.
>>>>
>>>> To get some p-values for my fixed effects I used model simplification,
>>>> and the typical output R gives looks like this:
>>>>
>>>> model2 = update(model1, ~ . - factorA)
>>>> anova(model1, model2)
>>>>
>>>>        Model df       AIC      BIC   logLik   Test  L.Ratio p-value
>>>> model1     1 26 -78.73898 15.29707 65.36949
>>>> model2     2 20 -73.70539 -1.36997 56.85270 1 vs 2 17.03359  0.0092
>>>>
>>>> I thought about presenting it very simply, copying the values from the
>>>> R table and writing it like: "factor A had a significant effect on the
>>>> response variable (likelihood ratio test, L-ratio = 17.034,
>>>> p = 0.0092)"
>>>>
>>>> But my boss argued that it's too unusual (at least in our field of
>>>> evolutionary biology) and that I should instead present the LR
>>>> statistic together with the corresponding Chi^2 statistic, since the
>>>> likelihood ratio statistic is approximately distributed as a Chi^2
>>>> with (df1 - df2) degrees of freedom, and then write down the p-value
>>>> corresponding to this value of Chi^2.
>>>>
>>>> I looked in the current literature but cannot really find a proper
>>>> answer to that dilemma.
>>>>
>>>> So, dear evolutionary biologists R users, how would you present it ?
>>>
>>> I am not an evolutionary biologist, but presumably your supervisor is
>>> one. Why are you picking a fight not only with him but with your
>>> prospective audience when there is no meaningful difference? Here is the
>>> p-value you would get with his method:
>>>
>>>>> 1-pchisq( 2*(65.36949 -  56.85270), df=6)
>>> [1] 0.009160622
>>>
>>
>> As I understood the question, it *is* purely formalistic. I.e., what to
>> write, not what to do.
>>
>> I'd say "L-ratio" is plain wrong, since this is not a ratio, but the log
>> of a ratio. "-2lnQ" or "-2logQ" is what my old teachers would write, but
>> pragmatically, I'd expect the best chances with editors and reviewers to
>> be "LRT: chi-square=17.03, df=6, p=0.0092", possibly with LRT spelled
>> out. (Some journals like to have the df because it allows reviewers to
>> catch glaring mistakes like categorical variables treated as numeric.)
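
For what it's worth, the arithmetic behind the reporting format suggested
above can be reproduced from the two logLik values in the anova() table.
This is only a minimal sketch: the variable names are made up for
illustration, the numbers are copied by hand from the quoted output, and it
assumes the usual chi-square approximation with df equal to the difference
in parameter counts.

```r
# Values copied by hand from the anova() table quoted above
# (variable names are hypothetical, chosen for illustration only)
loglik_full    <- 65.36949   # model1, 26 parameters
loglik_reduced <- 56.85270   # model2, 20 parameters
df_full        <- 26
df_reduced     <- 20

# LRT statistic: -2 log Q = 2 * (logLik_full - logLik_reduced)
lrt_stat <- 2 * (loglik_full - loglik_reduced)
lrt_df   <- as.integer(df_full - df_reduced)

# Upper-tail chi-square probability; same value as 1 - pchisq(lrt_stat, lrt_df)
p_value <- pchisq(lrt_stat, df = lrt_df, lower.tail = FALSE)

# One-line summary in the "LRT: chi-square, df, p" style
report <- sprintf("LRT: chi-square = %.2f, df = %d, p = %.4f",
                  lrt_stat, lrt_df, p_value)
print(report)
# [1] "LRT: chi-square = 17.03, df = 6, p = 0.0092"
```

Reporting the df alongside the statistic lets a reader redo this check from
the printed numbers alone.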
> 
> I wonder about the phrase "used model simplification". Wouldn't that 
> raise a question about the proper degrees of freedom to use? If terms 
> were dropped from the model simply on the basis of 
> "non-significance" shouldn't there be some appropriate penalization of 
> subsequent tests of significance?

Absolutely.  At the least, the unbiased estimate of sigma^2 from the 
fullest model fit should be used in place of sigma^2 in the model used. 
More severe corrections are probably warranted, though.

Frank
-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University



