[R] Separation issue in binary response models - glm, brglm, logistf

Xochitl CORMON Xochitl.Cormon at ifremer.fr
Wed Feb 27 18:04:42 CET 2013


Dear all,

I am running into some problems with my data and need some help.
I am trying to fit a GLM with a presence/absence variable as the 
response and several explanatory variables (time, location, 
presence/absence data, abundance data).

First I tried the glm() function, but I got 2 warnings 
from glm.fit():
# 1: glm.fit: algorithm did not converge
# 2: glm.fit: fitted probabilities numerically 0 or 1 occurred
After some investigation I found that the problem was most probably 
quasi-complete separation, and I therefore decided to use brglm and/or logistf.
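
A minimal sketch, on made-up data and using only base R, of how separation leads to the second warning above. The data frame `toy` and the variable `glm_warnings` are hypothetical names introduced for this illustration; the real data set is not used, and the first warning may or may not appear on this toy example.

```r
# Toy data (made up for illustration): y is 0 for x <= 5 and 1 for x > 5,
# so x separates the two outcomes perfectly.
toy <- data.frame(x = 1:10, y = rep(c(0, 1), each = 5))

# Collect the warnings raised during the fit instead of printing them.
glm_warnings <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial, data = toy),
  warning = function(w) {
    glm_warnings <<- c(glm_warnings, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(glm_warnings)

# With separation the slope estimate drifts towards infinity, so the
# reported coefficient and its standard error are both very large.
print(coef(summary(fit)))
```

If a real model shows the same pattern (very large estimates with even larger standard errors), separation is a plausible explanation, although this sketch cannot confirm it for the data described here.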

* logistf: the analysis does not run
When running logistf() I get an error message saying:
# error in chol.default(x) :
# leading minor 39 is not positive definite
I looked in the logistf package manual, on the Internet, and in the theoretical 
and technical paper of Heinze and Ploner, but I cannot find where 
chol.default() is used or whether the error can be fixed by changing some settings.
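
For context, chol() is the base R Cholesky factorization, and it stops with this kind of error when the matrix it receives is not positive definite. A minimal sketch in base R, assuming a hypothetical design matrix `X` with two identical columns; where exactly logistf calls chol(), and whether this is what happens with the real data, is not established here.

```r
# Hypothetical design matrix (made up): columns a and b are identical,
# so the matrix does not have full column rank.
X <- cbind(a = c(1, 0), b = c(1, 0))

# Cross-product X'X: a 2 x 2 matrix of ones, which is singular.
XtX <- crossprod(X)

# chol() needs a positive definite matrix; capture the error it raises.
chol_result <- tryCatch(chol(XtX), error = function(e) e)
print(conditionMessage(chol_result))

# The QR decomposition reports the rank, here lower than ncol(X).
print(c(columns = ncol(X), rank = qr(X)$rank))
```

Comparing `qr(X)$rank` with `ncol(X)` on the model matrix of the real model would be one way to check for this kind of rank deficiency.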

* brglm: the analysis runs
However, I get a warning message saying:
# In fit.proc(x = X, y = Y, weights = weights, start = start,
#   etastart = etastart,  :
# Iteration limit reached
As before, I cannot find where and why fit.proc() is called when 
running the package, or whether the warning can be fixed by adjusting some settings.
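
On the settings question, a minimal sketch that assumes the brglm package is installed. brglm() accepts a control.brglm argument built with brglm.control(), and its br.maxit entry sets the maximum number of bias-reduction iterations, which is presumably the limit this warning refers to. The toy data and the value 1000 (named `br_maxit_example` here) are made up for illustration; whether more iterations help with the real model is not known.

```r
# Hypothetical iteration limit, chosen only for illustration.
br_maxit_example <- 1000

# Toy data (made up): perfectly separated by x.
toy <- data.frame(x = 1:10, y = rep(c(0, 1), each = 5))

# Assumption: the brglm package is installed; otherwise nothing is fitted.
if (requireNamespace("brglm", quietly = TRUE)) {
  br_fit <- brglm::brglm(
    y ~ x, family = binomial, data = toy,
    control.brglm = brglm::brglm.control(br.maxit = br_maxit_example)
  )
  # Bias-reduced estimates stay finite even though x separates y.
  print(coef(br_fit))
}
```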

More generally, I was wondering what the fundamental 
differences between these packages are.

I hope this makes enough sense, and I am sorry if the answer is 
statistically obvious and I am simply not aware of it.

This is my first time asking a question here, so I apologize if it is not as it 
should be; please do not hesitate to let me know.

Thank you for your help

Xochitl C.

-----------------------------------------------------------------------

Here is an extract of my table and the model calls I ran:

> head(CPUE_table)
  Year Quarter Subarea Latitude Longitude Presence.S CPUE.S
1 2000       1    31F1    51.25       1.5          0      0
2 2000       1    31F2    51.25       2.5          0      0
3 2000       1    32F1    51.75       1.5          0      0
4 2000       1    32F2    51.75       2.5          0      0
5 2000       1    32F3    51.75       3.5          0      0
6 2000       1    33F1    52.25       1.5          0      0
  Presence.H CPUE.H Presence.NP CPUE.NP Presence.BW CPUE.BW
1          0      0           0       0           0       0
2          0      0           0       0           0       0
3          0      0           0       0           0       0
4          0      0           0       0           0       0
5          0      0           0       0           0       0
6          0      0           0       0           0       0
  Presence.C CPUE.C Presence.P CPUE.P Presence.W   CPUE.W
1          1 76.002          0      0          1 3358.667
2          1 12.500          0      0          1 3028.500
3          1  5.500          0      0          1 2256.750
4          1 10.000          0      0          1  808.000
5          1 19.000          0      0          1  277.000
6          0  0.000          0      0          1    2.000

> tail(CPUE_table)
     Year Quarter Subarea Latitude Longitude Presence.S   CPUE.S
4435 2012       3    50F3    60.75       3.5          1  103.000
4436 2012       3    51E8    61.25      -1.5          1 1311.600
4437 2012       3    51E9    61.25      -0.5          1   34.336
4438 2012       3    51F0    61.25       0.5          1  430.500
4439 2012       3    51F1    61.25       1.5          1  115.000
4440 2012       3    51F2    61.25       2.5          1   72.500
     Presence.H  CPUE.H Presence.NP  CPUE.NP Presence.BW  CPUE.BW
4435          1 110.000           1  2379.00           1   20.000
4436          1  12.000           1  4194.78           0    0.000
4437          1  46.671           1 11031.56           1    2.668
4438          1 148.000           1  1212.22           1 3279.200
4439          1  85.000           1  2089.50           1    1.000
4440          1  35.500           1   270.48           1  516.300
     Presence.C CPUE.C Presence.P CPUE.P Presence.W CPUE.W
4435          1  6.000          0      0          1 22.000
4436          1 18.000          0      0          0  0.000
4437          1  3.335          0      0          1  3.333
4438          1  2.000          0      0          1  2.000
4439          1 22.000          1      2          1 40.000
4440          1 11.500          1      1          1 16.000

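library(logistf)  # Firth-type penalized-likelihood logistic regression
library(brglm)    # bias reduction for binomial-response GLMs

# Both calls use the same right-hand side: all main effects plus all
# two-way interactions, which is what (...)^2 expands to.
logistf_binomPres <- logistf(Presence.S ~ (Presence.BW + Presence.W + 
    Presence.C + Presence.NP + Presence.P + Presence.H + CPUE.BW + CPUE.H + 
    CPUE.P + CPUE.NP + CPUE.W + CPUE.C + Year + Quarter + Latitude + 
    Longitude)^2, data = CPUE_table)

Brglm_binomPres <- brglm(Presence.S ~ (Presence.BW + Presence.W + 
    Presence.C + Presence.NP + Presence.P + Presence.H + CPUE.BW + CPUE.H + 
    CPUE.P + CPUE.NP + CPUE.W + CPUE.C + Year + Quarter + Latitude + 
    Longitude)^2, family = binomial, data = CPUE_table)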
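
To get a feel for the size of this model, a minimal base R sketch that counts the terms generated by the (...)^2 formula, and then shows on made-up data how a presence indicator crossed with its own abundance can produce a redundant column. The second part assumes that a Presence column is 1 exactly when the matching CPUE column is positive, which the extract suggests but does not prove; `toy`, `rhs` and the other names are hypothetical.

```r
# Right-hand side copied from the model calls; the response is not needed
# just to count terms.
rhs <- ~ (Presence.BW + Presence.W + Presence.C + Presence.NP + Presence.P +
          Presence.H + CPUE.BW + CPUE.H + CPUE.P + CPUE.NP + CPUE.W + CPUE.C +
          Year + Quarter + Latitude + Longitude)^2

term_labels <- attr(terms(rhs), "term.labels")
n_main  <- sum(!grepl(":", term_labels))  # main effects
n_inter <- sum(grepl(":", term_labels))   # two-way interactions
print(c(main = n_main, interactions = n_inter))  # 16 and 120

# Toy data (made up): presence is 1 exactly when cpue > 0.
toy <- data.frame(cpue = c(0, 0, 3.5, 12, 0, 7))
toy$presence <- as.integer(toy$cpue > 0)

# presence:cpue equals cpue row by row, so one column is redundant.
X_toy <- model.matrix(~ (presence + cpue)^2, data = toy)
print(c(columns = ncol(X_toy), rank = qr(X_toy)$rank))  # 4 and 3
```

If all 16 predictors are numeric, the full model has 136 terms plus an intercept, which is a lot to estimate from presence/absence data and leaves plenty of room for both separation and redundant columns.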

-----------------------------------------------------------------------



-- 

<>< <>< <>< <><

Xochitl CORMON
+33 (0)3 21 99 56 84

Doctorante en sciences halieutiques
PhD student in fishery sciences

<>< <>< <>< <><

IFREMER
Centre Manche Mer du Nord
150 quai Gambetta
62200 Boulogne-sur-Mer

<>< <>< <>< <><


