[Rd] model.matrix and na.action

Ben Bolker bbolker at gmail.com
Wed Apr 29 21:07:45 CEST 2015


  I've finally been able to piece this together, but I wonder whether
I've got it right, and whether the behaviour of `model.matrix` with
respect to `na.action` is documented more *explicitly* anywhere.

  * model.matrix() respects the 'na.action' associated with its
'data' argument.
  * If the 'data' argument is a model frame with an "na.action"
attribute, then that is used.
  * If the 'data' argument is _not_ a model frame (which does go
against the implicit suggestion of ?model.matrix), model.frame() is
called on the data, which means that by default the global
options("na.action") setting is used.
  * The intended design is that one should first construct the model
frame using an explicit `na.action` and then pass it to `model.matrix`.
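
  A minimal sketch of the points above, on a toy data frame. The
names `d`, `mf_pass`, `mf_omit`, `X_pass`, `X_omit`, `X_raw` and
`n_rows` are made up for illustration, and the global option is pinned
to "na.omit" only so that the raw-data case is reproducible:

```r
## Hypothetical toy data: two missing values followed by 1..5
d <- data.frame(x = c(NA, NA, 1:5))

## Pin the global option (an assumption made for reproducibility)
op <- options(na.action = "na.omit")

## Build model frames with an explicit na.action
mf_pass <- model.frame(~ x, data = d, na.action = na.pass)
mf_omit <- model.frame(~ x, data = d, na.action = na.omit)

## A model frame is used as given, so the NA handling chosen when
## the frame was built carries through to the model matrix
X_pass <- model.matrix(~ x, mf_pass)
X_omit <- model.matrix(~ x, mf_omit)

## A raw data frame goes through model.frame() internally, which
## falls back on getOption("na.action")
X_raw <- model.matrix(~ x, d)

n_rows <- c(pass = nrow(X_pass), omit = nrow(X_omit), raw = nrow(X_raw))
print(n_rows)

## The dropped rows are recorded on the model frame itself
print(attr(mf_omit, "na.action"))

## Restore the previous option
options(op)
```

  With these seven rows I would expect 7, 5 and 5 rows respectively:
only the frame built with na.pass keeps the NA rows.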

  (After spending a few hours figuring this out and constructing the
e-mail, it has turned from a question into a request for confirmation
... I do think a couple of extra sentences of explication in the
documentation for dummies like me wouldn't hurt; I would be happy to
submit a documentation patch if that seems worthwhile.)

--------
  I've tried looking through model.matrix.default and through the
modelmatrix function in src/library/stats/src/model.c , but it's
pretty hairy ...

  Related discussion:

http://stackoverflow.com/questions/5616210/model-matrix-with-na-action-null

http://stackoverflow.com/questions/6447708/model-matrix-generates-fewer-rows-than-original-data-frame

https://stat.ethz.ch/pipermail/r-help/2008-December/183509.html

https://stat.ethz.ch/pipermail/r-help/2001-August/014483.html (BDR
says here "?model.matrix does tell you the second argument should be
the result of model.frame, which is a pretty strong hint." ...)

==========


mm <- function(newdata, form = ~x, na.action = na.pass, set.opts = FALSE) {
    ## optionally override the global na.action option, restoring it on exit
    if (set.opts) {
        op <- options(na.action = na.action)
        on.exit(options(op))
    }
    ## try with raw data and with model.frame with na.action specified
    X1 <- model.matrix(form, model.frame(newdata, na.action = na.action))
    X2 <- model.matrix(form, newdata)
    ## does the "x" column contain NAs in each case?
    c(any(is.na(X1[, "x"])), any(is.na(X2[, "x"])))
}

options("na.action")  ## na.omit
d <- data.frame(x=c(NA,NA,1:5))
mm(d)  ## TRUE FALSE
mm(d,set.opts=TRUE)  ## TRUE TRUE