[R] discrepancy between R & Splus lm.influence() for family=Gamma
maechler at stat.math.ethz.ch
Fri Sep 12 09:16:49 CEST 2003
>>>>> "Andrew" == Andrew Hill <AHill at wyeth.com>
>>>>> on Thu, 11 Sep 2003 17:16:30 -0400 writes:
Andrew> Hello, I am looking for an explanation and/or fix
Andrew> for a discrepancy in the behaviour of the R
Andrew> lm.influence() function [ version R 1.5.0
Andrew> (2002-04-29) ] and the same function in Splus [
Andrew> Splus version 5.1 release 1, running on SGI IRIX
Andrew> 6.2]. The discrepancy is of concern because I am
Andrew> migrating some Splus scripts to R and need to ensure
Andrew> consistency of results.
Before reading on:
Do you really mean R 1.5.0?
If yes, you should definitely upgrade to R 1.7.1 !
There were considerable improvements (for R 1.7.0) for these
functions, mostly thanks to John Fox, and the recommended way in
R is to use influence() which is a generic function that has both an "lm"
and (important for you!) a "glm" method.
I.e., you use influence(mylmfit, ...) and the method
influence.glm(mylmfit, ...) will be called.
This should do the correct calculations for all kinds of glm models.
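
A minimal sketch of the recommended call, for a Gamma fit with identity link
like the one described in the question. The data are simulated and the names
(x, y, fit, infl) are made up for illustration; they are not from Andrew's
scripts. It assumes a version of R in which influence() is generic (1.7.0 or
later).

```r
# Simulated Gamma response with a linear mean (hypothetical data).
set.seed(1)
n  <- 50
x  <- runif(n, 1, 10)
mu <- 2 + 3 * x                        # mean is linear in x (identity link)
y  <- rgamma(n, shape = 5, rate = 5 / mu)

fit <- glm(y ~ x, family = Gamma(link = "identity"))

# influence() is generic; for a "glm" object it dispatches to the glm method.
infl <- influence(fit)

names(infl)              # components returned by the glm method
head(infl$hat)           # leverages
head(infl$coefficients)  # per-case change in the coefficients
head(infl$sigma)         # per-case drop-one scale estimates
```

Calling the generic rather than lm.influence() directly lets R choose the
method that matches the class of the fitted object.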
Martin Maechler <maechler at stat.math.ethz.ch> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO C16 Leonhardstr. 27
ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-3408 fax: ...-1228 <><
Andrew> Specifically, when I fit a glm() model to a test
Andrew> dataset using the family = Gamma(link=identity), and
Andrew> then call lm.influence on the fitted glm object, the
Andrew> resulting lm.influence()$coefficients and
Andrew> lm.influence()$sigma values are different between R
Andrew> and Splus versions. The lm.influence()$hat vector
Andrew> does agree between the two programs. Also, the
Andrew> glm() function does return the same model
Andrew> coefficient in both R and Splus.
Andrew> In contrast, if I use the default glm
Andrew> family=Gaussian(link=identity), all output of
Andrew> lm.influence() for both R and Splus does agree fully
Andrew> for my dataset.
Andrew> I have read the R help function for lm.influence()
Andrew> and I understand that R returns the difference
Andrew> between the model coefficients and the drop-one
Andrew> coefficients, while Splus returns the drop-one
Andrew> coefficients. But this does not account for the
Andrew> discrepancy that I see in the
Andrew> lm.influence$coefficients, nor the difference in
Andrew> lm.influence$sigma, at least to my understanding.
Andrew> Pasted below is output, first from R, and second
Andrew> from Splus, which illustrates the issue.
Andrew> Discrepancies between the R and Splus $sigma values
Andrew> look like ~ 2-6%. Hopefully I have not overlooked
Andrew> an obvious statistical explanation for the
Andrew> Thanks, Andrew.
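
The difference in convention mentioned in the question (R returns the change
in the coefficients, Splus the drop-one coefficients themselves) can be
sketched as follows. This is an illustration only: it uses an ordinary linear
model on the built-in cars data, where the case-deletion formulas are exact,
and it does not reproduce or explain the Gamma discrepancy. The object names
are made up.

```r
# Ordinary least-squares fit on a built-in data set.
fit  <- lm(dist ~ speed, data = cars)
infl <- lm.influence(fit)

i <- 1  # case to drop

# Drop-one coefficients obtained by actually refitting without case i.
drop_one_refit <- coef(lm(dist ~ speed, data = cars[-i, ]))

# R stores the change in the coefficients, so the drop-one coefficients
# are recovered by subtracting row i from the full-model coefficients.
drop_one_from_infl <- coef(fit) - infl$coefficients[i, ]

all.equal(unname(drop_one_refit), unname(drop_one_from_infl))
```

When comparing the $coefficients output of the two programs, the same
subtraction would have to be applied to the R values first.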