[R] A Tip: lm, glm, and retained cases

hadley wickham h.wickham at gmail.com
Wed Aug 27 01:49:37 CEST 2008


On Tue, Aug 26, 2008 at 6:45 PM, Ted Harding
<Ted.Harding at manchester.ac.uk> wrote:
> Hi Folks,
> This tip is probably lurking somewhere already, but I've just
> discovered it the hard way, so it is probably worth passing
> on for the benefit of those who might otherwise hack their
> way along the same path.
>
> Say (for example) you want to do a logistic regression of a
> binary response Y on variables X1, X2, X3, X4:
>
>  GLM <- glm(Y ~ X1 + X2 + X3 + X4, family = binomial)
>
> Say there are 1000 cases in the data. Because of missing values
> (NAs) in the variables, the number of complete cases retained
> for the regression is, say, 600. glm() does this automatically.
>
> QUESTION: Which cases are they?
>
> You can of course find out "by hand" on the lines of
>
>  ix <- which( (!is.na(Y))&(!is.na(X1))&...&(!is.na(X4)) )
>
> but one feels that GLM already knows -- so how to get it to talk?
>
> ANSWER: (e.g.)
>
>  ix <- as.integer(names(GLM$fitted.values))

Alternatively, you can get the cases that were dropped (rather than
the ones retained) with:

attr(GLM$model, "na.action")
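
For comparison, here is a minimal sketch of the three routes side by
side. The data frame d, its two predictors, and the positions of the
NAs are all made up for illustration, and it assumes R's default
na.action (na.omit) is in effect:

```r
# Hypothetical data: 100 cases, with NAs planted at known rows.
set.seed(42)
n <- 100
d <- data.frame(Y = rbinom(n, 1, 0.5), X1 = rnorm(n), X2 = rnorm(n))
d$X1[c(3, 17, 58)] <- NA
d$X2[c(17, 40)]    <- NA
d$Y[71]            <- NA

# Logistic regression; incomplete cases are dropped automatically.
GLM <- glm(Y ~ X1 + X2, family = binomial, data = d)

# Retained cases, recovered from the names of the fitted values.
kept <- as.integer(names(GLM$fitted.values))

# Dropped cases, recovered from the model frame's "na.action" attribute.
dropped <- as.integer(attr(GLM$model, "na.action"))

# The "by hand" version, for checking.
by_hand <- which(complete.cases(d[, c("Y", "X1", "X2")]))

print(dropped)
print(identical(kept, by_hand))
```

Note that the as.integer(names(...)) route only works when the data
have the default integer row names; with character row names you would
keep the names as they are, or use the "na.action" attribute instead.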

Hadley

-- 
http://had.co.nz/


