[Rd] Subsetting issue in model.frame with na.omit

Trevor John Hastie hastie at stanford.edu
Mon Sep 19 22:13:29 CEST 2016


Running R version 3.3.1 (2016-06-21) Bug in Your Hair

I have discovered an issue with model.frame() with regard to its
implementation of the na.action argument. This impacts the gam package.

We expect the last step in model.frame() to be running na.action on the
frame it has produced.  In the example below, we use "na.action=na.omit",
which calls for subsetting out rows of the frame.  However, when it does
this, it does not see that there is a [.smooth method for the two columns,
which are of S3 class "smooth". So it does the subsetting, but without
using the subset method. In my example, this is evidenced by the $NAs
element of the attributes of each of the components still being present.

When I instead use "na.action=na.pass" in the call to model.frame,
and then filter the resulting frame through na.omit(), it does the right
thing: the $NAs component has disappeared, which is what should have
happened in the first case as well.
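
The two-step route described above can be wrapped in a small helper. This is
only a sketch: model_frame_then_omit is a hypothetical name, not a function in
R or in gam, and the toy data uses base R only so that it runs without gam.

```r
# Hypothetical helper (not part of R or gam): build the model frame with
# missing values kept, then drop incomplete rows in a separate step.
model_frame_then_omit <- function(formula, data) {
  mf <- stats::model.frame(formula, data = data, na.action = stats::na.pass)
  stats::na.omit(mf)
}

# Toy data, base R only
d <- data.frame(y = c(1, 2, NA, 4), x = c(1, NA, 3, 4))
mf <- model_frame_then_omit(y ~ x, d)
nrow(mf)   # 2: only rows 1 and 4 are complete
```

With gam attached, the same helper could be called with the s() formula from
the example in place of y ~ x.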

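### Reproducible example (requires the gam package)
set.seed(101)
n=30
x=matrix(runif(n*2),n,2)
### linear indices 1:20 all fall in the first column of x
x[sample(1:20,6,replace=FALSE)]=NA
dx=data.frame(x)
library(gam)
### Compare: na.omit applied inside model.frame(); $NAs is still present
m=model.frame(~s(X1,df=4)+s(X2,df=4),data=dx,na.action=na.omit)
attributes(m[[1]])
### with: na.pass inside model.frame(), then na.omit() on the result; $NAs is gone
m=model.frame(~s(X1,df=4)+s(X2,df=4),data=dx,na.action=na.pass)
m=na.omit(m)
attributes(m[[1]])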
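
To make concrete what "using the subset method" means, here is a minimal
base-R sketch that does not need gam. The class "tagged" and its constructor
are hypothetical stand-ins invented for illustration; they only loosely mimic
a column class that records NA positions in an "NAs" attribute. The sketch
shows the behaviour expected when na.omit() is applied to a data frame, and
says nothing about the cause of the model.frame() discrepancy.

```r
# Hypothetical S3 class "tagged" (not part of gam): records the positions of
# missing values in an attribute called "NAs".
tagged <- function(x) {
  nas <- which(is.na(x))
  if (length(nas)) attr(x, "NAs") <- nas
  class(x) <- "tagged"
  x
}

# Subset method: subsets the underlying data, then recomputes "NAs" for the
# elements that were kept.
"[.tagged" <- function(x, i, ...) {
  y <- unclass(x)
  attr(y, "NAs") <- NULL
  tagged(y[i])
}

d <- data.frame(id = 1:5)
d$v <- tagged(c(0.1, NA, 0.3, NA, 0.5))
attr(d$v, "NAs")        # positions 2 and 4 are recorded

clean <- na.omit(d)     # row subsetting goes through the column's `[` method
nrow(clean)
attr(clean$v, "NAs")    # NULL when "[.tagged" was dispatched
```

If the column had been subset without its method, the stale "NAs" attribute
would be the visible symptom, which is the same check the example above makes
with attributes(m[[1]]).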

------------------------------------------------------------------------------
  Trevor Hastie                                   hastie at stanford.edu  
  Professor, Department of Statistics, Stanford University
  Phone: (650) 725-2231                 Fax: (650) 725-8977  
  URL: http://www.stanford.edu/~hastie  
   address: room 104, Department of Statistics, Sequoia Hall
           390 Serra Mall, Stanford University, CA 94305-4065  
 ------------------------------------------------------------------------------


