[Rd] [R-pkg-devel] Guidelines for S3 regression models

Achim Zeileis Achim.Zeileis at R-project.org
Fri Jun 26 16:41:39 CEST 2015


Stephen,

thanks for your effort. The more appropriate list for this discussion is 
probably R-devel (as far as I understand it) so I've moved the discussion 
there.

Related topics have already been discussed in the past. Specifically, I 
remember contributions by Paul Johnson ("rockchalk" package) and John Fox 
("effects" and "car" packages), as their packages also provide generic 
infrastructure for visualizing models and carrying out inference. I also 
have some related packages such as "lmtest", "sandwich", "strucchange", or 
"multcomp". Exporting tables of regression coefficients in a modular way 
via "texreg" or "memisc" could also be added.

> Once we have built a regression model, we typically want to use the 
> model for further processing, such as making predictions from the model 
> or plotting the residuals.  Unfortunately, for many packages on CRAN 
> this can be difficult.
>
> For example, some models don't have a residuals method and don't save 
> the call or data --- so you can't tell how to generate the residuals 
> from the model object itself.
>
> A common snag is that for some models the new data for predict() has to 
> be a matrix; for others it has to be a data.frame.  This places an 
> unnecessary burden on the user when both data.frames and matrices can 
> easily be supported by predict.
>
> To mitigate such issues, I'm going out on a limb and presenting some 
> guidelines for writers of S3 regression model functions (this document 
> is currently part of the plotmo package):

I think this is a nice and useful starting point. It's probably not 
comprehensive (yet) but will surely help.

You could add something more about writing the formula interface and the 
correct processing of model.frame, terms, model.response, model.matrix, 
model.weights, model.offset. Especially for models with linear predictors 
the latter two can be very useful and are often not hard to implement. In 
case the model has multiple parts or multiple responses, the "Formula" 
package (and its vignette) might also be helpful.
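
For illustration, the processing described above could be sketched roughly 
as follows. This is only a minimal sketch modeled on the idiom used in lm(): 
the function name myreg() and the class "myreg" are hypothetical, and 
weighted least squares via lm.wfit() merely stands in for whatever fitting 
routine the model really uses.

```r
## Hypothetical fitting function "myreg", used only to illustrate the
## standard processing of the model frame (weights and offset included).
myreg <- function(formula, data, subset, na.action, weights, offset, ...)
{
  ## set up a call to model.frame() from the arguments actually supplied
  cl <- match.call()
  mf <- match.call(expand.dots = FALSE)
  m <- match(c("formula", "data", "subset", "na.action", "weights", "offset"),
             names(mf), 0L)
  mf <- mf[c(1L, m)]
  mf$drop.unused.levels <- TRUE
  mf[[1L]] <- quote(stats::model.frame)
  mf <- eval(mf, parent.frame())

  ## extract terms, response, regressor matrix, weights, and offset
  mt <- attr(mf, "terms")
  y <- model.response(mf, "numeric")
  x <- model.matrix(mt, mf)
  w <- model.weights(mf)
  if (is.null(w)) w <- rep(1, NROW(y))
  o <- model.offset(mf)
  if (is.null(o)) o <- rep(0, NROW(y))

  ## stand-in workhorse: weighted least squares on the offset-adjusted response
  fit <- lm.wfit(x, y - o, w)

  ## keep call, terms, and model frame so that methods can use them later
  rval <- list(coefficients = fit$coefficients,
               residuals = fit$residuals,
               fitted.values = fit$fitted.values + o,
               df.residual = fit$df.residual,
               weights = w, offset = o,
               call = cl, terms = mt, model = mf)
  class(rval) <- "myreg"
  rval
}

## usage with a data set shipped with R
fit <- myreg(dist ~ speed, data = cars)
coef(fit)
```

Storing the call, the terms, and the model frame in the returned object is 
what later allows methods to recover the data used for fitting.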

As for the S3 methods, I would omit coefficients(), fitted.values(), and 
resid() from the list. These dispatch to coef(), fitted(), and residuals() 
anyway, so methods are only needed for the latter three. For inference it 
would also be very useful to add nobs(), df.residual(), vcov(), and 
logLik() and/or deviance() where applicable. An overview that lists some 
(but not all) useful methods is in Table 1 of 
vignette("betareg", package = "betareg").
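
As a rough sketch of what such methods might look like, consider a 
hypothetical class "myreg" that stores a full-rank least-squares fit as a 
plain list. The constructor myreg_fit() and the element names are 
assumptions made for this example only; the Gaussian log-likelihood is the 
one appropriate for this toy model and would differ for other models.

```r
## Hypothetical class "myreg": a full-rank least-squares fit stored as a list.
myreg_fit <- function(x, y) {
  fit <- lm.fit(x, y)
  structure(list(coefficients = fit$coefficients,
                 residuals = fit$residuals,
                 fitted.values = fit$fitted.values,
                 df.residual = fit$df.residual,
                 qr = fit$qr, n = NROW(y)),
            class = "myreg")
}

## coef(), fitted(), residuals(), and df.residual() need no methods here
## because their default methods extract list elements of these names.

nobs.myreg <- function(object, ...) object$n

## residual sum of squares
deviance.myreg <- function(object, ...) sum(object$residuals^2)

## covariance matrix with the same names as coef() (assumes full rank)
vcov.myreg <- function(object, ...) {
  cf <- coef(object)
  sigma2 <- deviance(object) / df.residual(object)
  vc <- sigma2 * chol2inv(qr.R(object$qr))
  dimnames(vc) <- list(names(cf), names(cf))
  vc
}

## Gaussian log-likelihood; df counts the coefficients plus the variance
logLik.myreg <- function(object, ...) {
  n <- nobs(object)
  ll <- -n / 2 * (log(2 * pi) + log(deviance(object) / n) + 1)
  structure(ll, df = length(coef(object)) + 1, nobs = n, class = "logLik")
}

## usage: once logLik() and nobs() exist, AIC() and BIC() work as well
x <- cbind("(Intercept)" = 1, speed = cars$speed)
m <- myreg_fit(x, cars$dist)
vcov(m)
AIC(m)
```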

For coef() and vcov() it is useful/important that the names and dimensions 
match. Then Wald tests can be easily computed in functions like 
car::linearHypothesis(), car::deltaMethod(), lmtest::waldtest(), or 
lmtest::coeftest().
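
To see why matching names matter, here is a minimal base-R sketch of the 
kind of computation such functions carry out. The helpers wald_table() and 
wald_test() are hypothetical names for this example and are not the 
implementations used in "car" or "lmtest"; they rely on nothing but coef() 
and vcov().

```r
## Hypothetical helper: marginal Wald tests from coef() and vcov() only.
## With df = Inf the normal distribution is used as the reference.
wald_table <- function(object, df = Inf) {
  cf <- coef(object)
  vc <- vcov(object)
  ## this is where mismatching names or dimensions would break things
  stopifnot(identical(names(cf), rownames(vc)),
            identical(names(cf), colnames(vc)))
  se <- sqrt(diag(vc))
  stat <- cf / se
  p <- 2 * pt(-abs(stat), df = df)
  cbind(Estimate = cf, "Std. Error" = se, statistic = stat, p.value = p)
}

## Hypothetical helper: joint Wald chi-squared test that the coefficients
## selected by name are all zero.
wald_test <- function(object, which) {
  cf <- coef(object)[which]
  vc <- vcov(object)[which, which, drop = FALSE]
  stat <- drop(crossprod(cf, solve(vc, cf)))
  c(statistic = stat, df = length(cf),
    p.value = pchisq(stat, df = length(cf), lower.tail = FALSE))
}

## usage with a model class that already provides both methods
m <- lm(dist ~ speed, data = cars)
tab <- wald_table(m, df = df.residual(m))
wt <- wald_test(m, "speed")
```

Selecting coefficients by name, as in wald_test(), only works if coef() and 
vcov() agree on those names.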

Thanks & best wishes,
Achim

> http://www.milbo.org/doc/modguide.pdf
>
> Your comments would be appreciated.
>
> Stephen Milborrow
>


