[Rd] suggested addition to model.matrix

William Dunlap wdunlap at tibco.com
Tue Oct 4 18:27:19 CEST 2016


In addition, there is a formula method for data.frame that
assumes the first column is the dependent variable.
 > z <- data.frame(X1=1:6,X2=letters[1:3],Y=log(1:6))
 > formula(z)
 X1 ~ X2 + Y
 > colnames(model.matrix(formula(z), z))
 [1] "(Intercept)" "X2b"         "X2c"         "Y"

Spencer's request is that the default formula given to model.matrix have
no dependent variable.
 > colnames(model.matrix(~., z))
 [1] "(Intercept)" "X1"          "X2b"         "X2c"         "Y"

In my opinion, formula.data.frame is a mistake, but we don't need two
incompatible mistakes.
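
For anyone who wants to see the difference side by side, here is a minimal sketch. It assumes the same toy data frame as above, and it defines Spencer's proposed model.matrix.data.frame locally purely for illustration; no such method exists in the stats package. The mm_* names are made up for this example.

```r
# Toy data frame, mirroring the example above (X2 recycled explicitly).
z <- data.frame(X1 = 1:6, X2 = rep(letters[1:3], 2), Y = log(1:6))

# formula.data.frame treats the first column as the response,
# so X1 does not appear among the model matrix columns.
mm_first <- model.matrix(formula(z), z)

# A one-sided formula treats every column as a predictor.
mm_all <- model.matrix(~ ., z)

# Spencer's proposed method, defined here only as an illustration;
# it is NOT part of the stats package.
model.matrix.data.frame <- function(object, ...) {
  model.matrix(~ ., object, ...)
}
mm_method <- model.matrix(z)

# Columns present under "~ ." but missing under formula(z).
setdiff(colnames(mm_all), colnames(mm_first))
```

With the proposed method, model.matrix(z) and model.matrix(formula(z), z) would give different matrices for the same data frame, which is the incompatibility I am worried about.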


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Oct 3, 2016 at 9:46 PM, Fox, John <jfox at mcmaster.ca> wrote:

> Dear Spencer,
>
> I don't think that the problem of "converting a data frame into a model
> matrix" is well-defined, because there isn't a unique mapping from one to
> the other.
>
> In your example, you build the model matrix for the additive formula ~ a
> + b from the data frame containing a and b, using "treatment"
> contrasts, but there are other possible formulas (e.g., ~ a*b) and
> contrasts [e.g., model.matrix(~ a + b, dd, contrasts=list(a=contr.sum,
> b=contr.helmert))].
>
> So I think that the current approach is sensible -- to require both a data
> frame and a formula.
>
> Best,
>  John
>
> > -----Original Message-----
> > From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of
> > Spencer Graves
> > Sent: October 3, 2016 7:59 PM
> > To: r-devel at r-project.org
> > Subject: [Rd] suggested addition to model.matrix
> >
> > Hello, All:
> >
> >
> >        What's the simplest way to convert a data.frame into a
> > model.matrix?
> >
> >
> >        One way is given by the following example, modified from the
> > examples in help(model.matrix):
> >
> >
> > dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
> > ab <- model.matrix(~ a + b, dd)
> > ab0 <- model.matrix(~., dd)
> > all.equal(ab, ab0)
> >
> >
> >        What do you think about replacing "model.matrix(~ a + b, dd)" in
> > the current help(model.matrix) with this 3-line expansion?
> >
> >
> >        I suggest this, because I spent a few hours today trying to
> > convert a data.frame into a model.matrix before finding this.
> >
> >
> >        Also, what do you think about adding something like the following
> > to the stats package:
> >
> >
> > model.matrix.data.frame <- function(object, ...){
> >      model.matrix(~., object, ...)
> > }
> >
> >
> >        And then extend the above example as follows:
> >
> > ab. <- model.matrix(dd)
> > all.equal(ab, ab.)
> >
> >
> >        Thanks,
> >        Spencer Graves
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
