[R] Help with GLM starting values in user defined link function

Andrew.Hoskins at csiro.au Andrew.Hoskins at csiro.au
Fri Oct 24 05:10:30 CEST 2014


Hi Ken, 

Many thanks for your advice. 

Earlier this morning I stepped my way through the glm.fit function to see where things were falling over and realised that first and foremost I had my link function wrong (link and inverse were back to front). I've now fixed this and can get the model to produce coefficients following your example. However, once I try to fit the complete model (where eco is offset rather than estimated), I still end up with an error that the model was unable to fit coefficients. Although it does seem to make it though some iterations of the IRLS algorithm before this happens. I also end up with the same error when I first estimate the coefficient for eco and then try to offset the estimated eco coefficient*eco, which seems to suggest to me that there is something not right with how the link is working with offset. 

See below for the example code. 

I was wondering if you or anyone else has some more great advice? 

Thanks in advance. 

Andrew 

## Example fit of negative exponential binomial GLM 

## define link function 
negexp <- function() 
{ 
    linkfun <- function(mu) -log(1-mu) 
    linkinv <- function(eta) 1-exp(-eta) 
    mu.eta <- function(eta) exp(-eta) 
    valideta <- function(eta) all(is.finite(eta)) 
    link <- paste0("negexp") 
    structure(list(linkfun = linkfun, linkinv = linkinv, 
                   mu.eta = mu.eta, valideta = valideta, name = link), 
              class = "link-glm") 
} 

## create some data 

y <- c(0,0,0,0,0,1,1,1,0,1,0,1,1,0,0,1,1,1,1,0,1,0,1,0,1,0,1,1,1,0,1,0,1,0,0,1,0 
                ,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,1,1,0,1,0,1,1,0,1,0,1,0,1,1 
                ,0,1,0,0,1,0,0,1,1,1,0,0,1,1,1,1,0,0,0,1,0,1,0,1,0,0) 


eco <- c(0.30406146,0.77127034,0.27507740,0.34660849,0.10496959,0.87483283,1.34652163,0.59570289,0.76557945 
                ,1.07105129,0.85681582,1.15885519,0.62478718,0.82327890,0.61921331,1.40337615,1.69376337,0.96662890 
                ,0.62756558,1.22148480,0.29509170,1.20822702,1.04490241,0.63994034,0.44537652,0.80908805,1.38793219 
                ,0.68987695,0.65253113,0.10996619,2.18030035,0.95187860,1.91719194,0.55910638,0.42246265,0.99747093 
                ,0.65609015,1.56408171,0.09024976,0.49430176,0.89651639,0.13943031,0.72264673,0.33212781,0.53156567 
                ,0.24478163,0.20439708,0.26577897,0.73061755,1.41380646,0.45361391,0.53961802,0.20099582,0.16278695 
                ,0.51188479,1.23152701,1.45180489,0.16136045,0.84696597,1.06556860,0.31352700,7.54728452,0.47765713 
                ,1.62966928,0.51514442,0.87203787,0.33515181,0.71407043,0.84767445,0.33640927,0.70331392,0.41617675 
                ,1.41137914,0.22586531,0.92797131,1.34627407,0.21408341,0.38903027,0.91690877,0.95946623,0.46114617 
                ,0.62965571,7.50492235,1.96516642,0.61555184,1.24061426,0.95281453,1.02729643,1.44581350,1.63148077 
                ,1.02291891,0.80319545,0.92136436,1.22428318,0.59172977,1.56985149,0.35790202,2.23402940,0.98565537 
                ,0.41658919) 

geog <- c(254.94615,277.78675,3.69047,47.92363,0.90241,449.44532,1795.89910,985.66843,1063.47287 
                ,160.27883,58.72738,1093.00270,1423.51194,1232.16769,54.56121,4772.54353,1877.95322,1110.18161 
                ,174.53805,2829.67281,551.22870,1781.67608,495.13007,44.42326,1057.72959,783.99003,3025.58558 
                ,1855.59219,1715.27590,41.75478,3687.95693,2125.34324,751.42284,1598.04527,1625.35627,848.40949 
                ,1484.40835,3332.15554,214.99678,136.60188,1388.07919,77.21198,2366.56327,617.31749,1421.72213 
                ,537.38636,223.57615,256.35456,3022.63678,4783.64718,45.97153,194.79090,2.65647,69.08392 
                ,948.66990,1480.70503,2805.30205,144.55345,1134.92666,728.04570,1421.45250,827.57959,1517.75701 
                ,682.77014,1060.09369,448.44398,848.64842,1437.74925,2887.23135,56.28056,725.78408,91.19194 
                ,1905.87208,749.92265,261.19075,2529.80027,371.16338,1130.14904,802.23348,1851.86105,1274.20599 
                ,260.79728,1427.11459,3891.82373,482.58143,2011.86414,1310.10546,975.37470,1087.50127,2195.28667 
                ,2358.70761,44.82955,1553.35558,2261.60567,1216.64486,1674.70189,165.13405,1463.93362,1542.33074 
                ,1683.01992) 

testDat <- data.frame(y,eco,geog) 

## fit model 
fit.logit <- glm(y~eco+geog,family=binomial(cloglog),data=testDat) 
##EV <- negexp()$linkfun(fitted(fit.logit)) ## linking the fitted values doesn't work but ignoring this will 
LE.lm <- lm(EV ~ eco + geog, testDat)     
Ec <- coef(LE.lm) 
glm(y ~ eco + geog, family = binomial(negexp()),data=testDat,start=Ec) 


## fit model using offset 

fit.logit <- glm(y~offset(eco)+geog,family=binomial(cloglog),data=testDat) 
##EV <- negexp()$linkfun(fitted(fit.logit)) ## linking the fitted values doesn't work but ignoring this will 
LE.lm <- lm(EV ~ offset(eco) + geog, testDat)     
Ec <- coef(LE.lm) 
glm(y ~ offset(eco) + geog, family = binomial(negexp()),data=testDat,start=Ec) 

## fit model using offset with eco*coefficient * eco 

fit.logit <- glm(y~offset(coef(fit2)[2]*eco)+geog,family=binomial(cloglog),data=testDat) 
##EV <- negexp()$linkfun(fitted(fit.logit)) ## linking the fitted values doesn't work but ignoring this will 
LE.lm <- lm(EV ~ offset(coef(fit2)[2]*eco) + geog, testDat)     
Ec <- coef(LE.lm) 
fit2 <- glm(y ~ offset(coef(fit2)[2]*eco) + geog, family = binomial(negexp()),data=testDat,start=Ec)

-----Original Message-----
<Andrew.Hoskins <at> csiro.au> writes: 
> I'm trying to fit a binomial GLM with user defined 
 link 
 function (negative exponential), however I seem to 
> be unable to find the correct starting values to 
initialise such a model. I've tried taking starting 
> values from a logistic and log models fit to the 
same data and also tried to substitute the intercept 
from 
> the null model in as the starting value for this 
model, however all continue to return the same error. 
> 
> Andrew 
> 
> ## Example fit of negative exponential binomial 
GLM

> 
> ## define link function 
> negexp <- function() 
> { 
>     linkfun <- function(mu) 1-exp(-mu) 
>     linkinv <- function(eta) -log(1-eta) 
>     mu.eta <- function(eta) 1/(1-eta) 
>     valideta <- function(eta) TRUE 
>     link <- paste0("negexp") 
>     structure(list(linkfun = linkfun, linkinv = linkinv, 
>                    mu.eta = mu.eta, valideta = valideta,
... [show rest of quote]
name = link), 
>               class = "link-glm") 
> } 
> 
---SNIP--- 

Take a look at the limits of eta for the extreme values 
of mu and compare them with the linear predictor of 
your link applied to say the fitted values of your logit 
fit.  It seems to suggest that two values fall outside 
the range of valid eta, according to your linkfun: 
c(62, 83).  I got it to work 
with these removed although there were lots of other 
warnings that you might have to worry about. 
Also, when choosing start values you might want 
to base them on a fit with your link rather than 
a different one. So, I got start values by trying 

EV <- negexp()$linkfun(fitted(fit.logit)) 
LE.lm <- lm(EV ~ eco + geog, testDat)     
Ec <- coef(LE.lm)     

with these defined as in your mail (sorry I snipped 
your code out).   
So, I found 
which(fitted(LE.lm) > (1 - exp(-1))) 
62 83 
62 83 

and then 

glm(y ~ eco + geog, family = binomial(negexp()), 
       data = testDat[-c(62, 83), ], start = Ec)   


Coefficients: 
(Intercept)          eco         geog   
  1.593e-01    2.085e-01    4.713e-06   

Degrees of Freedom: 97 Total (i.e. Null);  95 Residual 
Null Deviance:	   134.4 
Residual Deviance: 112.3 AIC: 118.3 
There were 27 warnings (use warnings() to see them) 

HTH 

> 
> Andrew Hoskins 
> Postdoctoral reasearch fellow 
> Ecosystem Sciences 
> CSIRO 
> 
> E Andrew.Hoskins <at> csiro.au T +61 2 6246 5902 
> Black Mountain Laboratories 
> Clunies Ross Street, Acton, ACT 2601, Australia 
> www.csiro.au 
>
... [show rest of quote]

> 

-- 
Kenneth Knoblauch 
Inserm U846 
Stem-cell and Brain Research Institute 
Department of Integrative Neurosciences 
18 avenue du Doyen Lépine 
69500 Bron 
France 
tel: +33 (0)4 72 91 34 77 
fax: +33 (0)4 72 91 34 61 
portable: +33 (0)6 84 10 64 10 
http://www.sbri.fr/members/kenneth-knoblauch.html



More information about the R-help mailing list