[R] confidence interval for the logit - predict.glm

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed May 28 16:54:22 CEST 2008


Possibly your calculation overflows: exp(upperlogit)/(1+exp(upperlogit)) 
could be replaced by 1/(1+exp(-upperlogit)), or even better by 
plogis(upperlogit).  This could happen via the Hauck-Donner effect: the 
fitted probabilities are very near one and the standard errors are very 
large.

As for your other points, please follow the posting guide and
'provide commented, minimal, self-contained, reproducible code'.

On Wed, 28 May 2008, Christine Sonvilla wrote:

>
> Hello all,
>
> I've come across an online posting
>
> http://www.biostat.wustl.edu/archives/html/s-news/2001-10/msg00119.html
>
> that described how to get confidence intervals for predicted values from 
> predict.glm. These instructions were meant for S-Plus. Yet, it generally 
> seems to work with R too, but I am encountering some problems. I am 
> explaining my procedure in the following and would be most grateful if 
> anyone could reply to me on that.
>
> I have run glm models to predict fish species from a number of 
> environmental predictors, using a stepwise approach (I am aware that 
> there is some controversy going on about this). I have got a data set 
> that is called "testfishes.txt", "predy.txt" contains the predictors for 
> the testfishes only and "predallx.txt" contains all predictors for all 
> planning units.
>
> I used the following commands: "stepmodfi" is the output of the stepwise approach
>
> predictionlogit <- predict(stepmodfi, predallx, type="link", se.fit=TRUE)
>
> upperlogit <- predictionlogit$fit + 2*predictionlogit$se.fit
> lowerlogit <- predictionlogit$fit - 2*predictionlogit$se.fit
>
> upperresponse[,i] <- exp(upperlogit)/(1+exp(upperlogit))
> lowerresponse[,i] <- exp(lowerlogit)/(1+exp(lowerlogit))

What is 'i' here?

> predictionresponse <- predict(stepmodfi, predallx, type="response", se.fit=FALSE)
> fishoccur[,i] <- predictionresponse
>
>
> type="link" should be the equivalent to the S-Plus version of type="1", 
> which was indicated in the online posting explaining this procedure.
>
> So I first tried to get predictions on the logit scale and there is 
> already the point where I am a bit bewildered. Predictionlogit would 
> only result in ONE value for every single planning unit contained in 
> predallx.txt, when actually I would assume that there would be "planning 
> unit times fishes" predicted values - at least this is what I get from 
> predictionresponse. Predictionresponse generates as many predicted 
> values as there are planning units times fishes (1680 planning units x 
> 205 fish species, whihc are predicted over all these planning units), 
> which I put into a constructed matrix named "fishoccur". In fact this is 
> what I did in the very beginning, generating predicted values with the 
> command predictionresponse <- predict(stepmodfi, predallx, 
> type="response", se.fit=FALSE). Then I wanted to construct confidence 
> intervals around these values. Therefore I first generated the logit 
> predicted values, but that already is the point I am unsure about.
>
> The problem in fact is, that quite some of my upper values of the 
> confidence interval result in NAs. The lower interval seems fine. Yet, I 
> am not even sure if my approach is correct, regarding this single value 
> that is being produced by the prediction on the logit scale.
>
> I hope that my descriptions were clear enough and would greatly 
> appreciate if anyone could suggest how to tackle this, whether the 
> approach itself is fine (which I believe indeed) and how to get proper 
> lower and upper confidence values.
>
> Many thanks in advance!
>
> Christine
>
> -- 
> Super-Aktion nur in der GMX Spieleflat: 10 Tage für 1 Euro.
> Über 180 Spiele downloaden: http://flat.games.gmx.de
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


More information about the R-help mailing list