[R] why does glm.predict give values over 1 ?

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Tue Nov 1 00:32:49 CET 2005


Hi Rohit:

On 31-Oct-05 Rohit Singh wrote:
> Hi Ted,
>  So here's what I'm doing:
> 
> This is my call to predict.glm:
> 
>> pY <- predict.glm(from69.fin.glm, newdata=d.tab, type="response")
> 
> This is what the fitted glm object looks like:
> 
>> from69.fin.glm
> 
> Call:  glm(formula = TR ~ z1 + e12_div_p_n + z2 + p_n, data = j2.tab)

***>>> It looks as though you have omitted the "family" argument.
The default is "gaussian" (see "?glm"), but for logistic regression
you need 'family = "binomial"', the default link for "binomial"
being "logit", which is correct for logistic regression. If you
were doing probit analysis, for example, then you would need
to specify 'family = binomial(link = "probit")'. So your call to glm
should look like

from69.fin.glm <- glm(TR ~ z1 + e12_div_p_n + z2 + p_n,
                      data = j2.tab,
                      family=binomial)

Try that -- it should be OK this time! (I think your call to
predict.glm looks all right, provided the data frame d.tab has
the correct structure for your data.)
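To illustrate the difference, here is a self-contained sketch with
simulated data (not your actual variables): with the default gaussian
family, glm fits an ordinary linear model, so predictions on a 0/1
response are unconstrained and can stray outside [0, 1]; with
family = binomial, type = "response" returns fitted probabilities,
which are always in [0, 1].

```r
## Simulated binary data: y depends on a single predictor z1
set.seed(1)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(2 * x))   # binary 0/1 response
d <- data.frame(TR = y, z1 = x)

gauss.fit <- glm(TR ~ z1, data = d)                     # family omitted!
logit.fit <- glm(TR ~ z1, data = d, family = binomial)  # logistic regression

## Predict over a wide range of the predictor
new.d   <- data.frame(z1 = seq(-5, 5, by = 0.5))
p.gauss <- predict(gauss.fit, newdata = new.d, type = "response")
p.logit <- predict(logit.fit, newdata = new.d, type = "response")

range(p.gauss)   # strays outside [0, 1]
range(p.logit)   # stays within [0, 1]
```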

Best wishes,
Ted.


> Coefficients:
> (Intercept)           z1  e12_div_p_n           z2          p_n
>   0.0462932    0.0063221   -0.0202138    0.0063221    0.0004168
> 
> Degrees of Freedom: 137 Total (i.e. Null);  133 Residual
> Null Deviance:      34.32
> Residual Deviance: 21.93        AIC: 149.8
> 
> 
> This is an example of what the data file looks like
> 
> TR s_n p_n z1 z2 z1_div_s_n z2_div_s_n z1_div_p_n z2_div_p_n e1 e2 e1_div_s_n e2_div_s_n e1_div_p_n e2_div_p_n e12 e12_div_s_n e12_div_p_n
> 0 169.000 167.141 8.800 3.800 0.052 0.022 0.053 0.023 -2295.000 -4007.000 -13.580 -23.710 -13.731 -23.974 0.000 0.000 0.000
> 1 615.500 615.352 29.700 21.800 0.048 0.035 0.048 0.035 -5344.000 -4248.000 -8.682 -6.902 -8.684 -6.903 141.740 0.230 0.230
> 0 409.500 388.149 5.400 19.000 0.013 0.046 0.014 0.049 -6328.000 -4597.000 -15.453 -11.226 -16.303 -11.843 1069.890 2.613 2.756
> 0 782.500 776.276 26.100 28.800 0.033 0.037 0.034 0.037 -1279.000 1260.000 -1.635 1.610 -1.648 1.623 67.500 0.086 0.087
> 1 355.500 355.117 28.800 32.400 0.081 0.091 0.081 0.091 -10600.000 -9670.000 -29.817 -27.201 -29.849 -27.230 418.560 1.177 1.179
> 0 184.500 164.012 4.900 9.500 0.027 0.051 0.030 0.058 -4519.000 -1901.000 -24.493 -10.304 -27.553 -11.591 -963.600 -5.223 -5.875
> 
> 
> 
> Thanks,
> rohit
> 
> On Mon, 31 Oct 2005 Ted.Harding at nessie.mcc.ac.uk wrote:
> 
>> On 31-Oct-05 Rohit Singh wrote:
>> > Hi,
>> >
>> >  This is a newbie question. I have been using glm to perform some
>> > logistic regression. However, if I take the fitted parameters (as
>> > part of the glm object) and pass them on the glm.predict function,
>> > for some test cases I am getting predicted values that are a little
>> > over 1.  This is a bit puzzling for me, because my understanding
>> > was that these numbers are probabilities and so should be between
>> > 0 and 1.
>> >
>> > Thanks a lot! I'd appreciate any help you could provide.
>> >
>> > -rohit
>>
>> Indeed this should not happen, and probably there is some mistake
>> in the way you use the predict function (which requires a little
>> care).
>>
>> However, it's not possible to pinpoint what is happening
>> without seeing a specific case. Can you post an example of the
>> code you use when this happens? And, if feasible, also an example
>> of the data.
>>
>> Best wishes,
>> Ted.
>>
>>
>> --------------------------------------------------------------------
>> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
>> Fax-to-email: +44 (0)870 094 0861
>> Date: 31-Oct-05                                       Time: 23:00:39
>> ------------------------------ XFMail ------------------------------
>>

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 31-Oct-05                                       Time: 23:32:45
------------------------------ XFMail ------------------------------



