[R] remove all terms with interaction factor in formula

William Dunlap wdunlap at tibco.com
Fri Sep 14 01:21:03 CEST 2012


The fm["c",]==1 test does not work correctly, because the value 2 also
means that the variable (the row) is in the term (the column).
The 2 means that you don't need to apply the contrasts function
to that variable in this term.
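
A minimal sketch of how a 2 can arise, assuming a made-up formula
(~ a + d + b:c) that is not from this thread: b:c appears without the
b and c main effects, so neither variable is coded as 1 in that term.
The names fm2, keep_eq1 and keep_ne0 are for illustration only.

```r
# Hypothetical formula: b:c is present but the main effects b and c are not,
# so the "factors" matrix does not code b and c as 1 in the b:c column.
fm2 <- attr(terms(~ a + d + b:c), "factors")
fm2

# Comparing against 1 fails to recognise b:c as a term containing b and c ...
keep_eq1 <- colnames(fm2)[!(fm2["b", ] == 1 & fm2["c", ] == 1)]

# ... while comparing against 0 treats both 1 and 2 as "variable is in term".
keep_ne0 <- colnames(fm2)[fm2["b", ] == 0 | fm2["c", ] == 0]

keep_eq1
keep_ne0
```

With the full a*b*c*d formula every margin is present, so the two tests
agree there; the difference only shows up when margins are missing.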

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: Thursday, September 13, 2012 4:15 PM
> To: William Dunlap
> Cc: Bert Gunter; Alexander Shenkin; r-help at r-project.org
> Subject: Re: [R] remove all terms with interaction factor in formula
> 
> 
> On Sep 13, 2012, at 11:53 AM, William Dunlap wrote:
> 
> > Your method would not work for, e.g., "a:d".  You could look at the "factors" attribute
> > of a terms object and select out those columns with non-zero entries for the variables
> > in the interaction of interest.  E.g.,
> >> fm <- attr(terms(~a*b*c*d), "factors")
> >> fm
> >   a b c d a:b a:c b:c a:d b:d c:d a:b:c a:b:d a:c:d b:c:d a:b:c:d
> > a 1 0 0 0   1   1   0   1   0   0     1     1     1     0       1
> > b 0 1 0 0   1   0   1   0   1   0     1     1     0     1       1
> > c 0 0 1 0   0   1   1   0   0   1     1     0     1     1       1
> > d 0 0 0 1   0   0   0   1   1   1     0     1     1     1       1
> >> colnames(fm)[fm["b",]==0 | fm["c",]==0]
> > [1] "a"     "b"     "c"     "d"     "a:b"   "a:c"   "a:d"   "b:d"   "c:d"   "a:b:d" "a:c:d"
> >
> 
> It's probably a black mark against my abilities to do logic manipulations, but it made a lot
> more sense when I wrote it (admittedly with the same meaning) as:
> 
> colnames(fm)[ !(fm["b",]==1 & fm["c",]==1) ]
> 
> Here's a grepping method that only requires that the order be a.d in any term:
> 
> > as.formula(paste("~", paste(
>       grep("a.+d", attr(terms(~a*b*c*d), "term.labels" ) ,
>            invert=TRUE, value=TRUE), collapse="+") ) )
> ~a + b + c + d + a:b + a:c + b:c + b:d + c:d + a:b:c + b:c:d
> 
> I think that if you are working with a*b*c*d, the order will always be a-before-d.
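
A possible variant that does not rely on the order of names within a
term: split each term label on ":" and test for the variable names
directly. This is only a sketch; it assumes plain variable names (a
label such as I(x:y) would be split incorrectly), and the names labs,
parts and has_ad are made up for illustration. It also avoids a pattern
like "a.+d" matching longer variable names that merely contain those
letters.

```r
labs <- attr(terms(~ a*b*c*d), "term.labels")

# Split each label into its component variable names.
parts <- strsplit(labs, ":", fixed = TRUE)

# TRUE for terms that contain both a and d, wherever they sit in the label.
has_ad <- vapply(parts, function(p) all(c("a", "d") %in% p), logical(1))

# Reassemble the remaining labels into a one-sided formula.
reformulate(labs[!has_ad])
```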
> 
> --
> David.
> 
> 
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
> >> Sent: Thursday, September 13, 2012 11:28 AM
> >> To: Bert Gunter
> >> Cc: Alexander Shenkin; r-help at r-project.org
> >> Subject: Re: [R] remove all terms with interaction factor in formula
> >>
> >>
> >> On Sep 13, 2012, at 11:00 AM, Bert Gunter wrote:
> >>
> >>> ~ a*b*d + a*c*d
> >>
> >> That seemed pretty clear and obvious, but I started wondering how to tell the machine to
> >> do it. Here is another idea:
> >>
> >>> grep("b:c", attr(terms(~a*b*c*d), "term.labels" ) ,invert=TRUE, value=TRUE)
> >> [1] "a"     "b"     "c"     "d"     "a:b"   "a:c"   "a:d"   "b:d"   "c:d"   "a:b:d" "a:c:d"
> >>
> >> (Although I realize it's no longer a formula and might need to be reassembled with
> >> `paste` and `as.formula`.)
> >>
> >> --
> >> David.
> >>
> >>> -- Bert
> >>> On Thu, Sep 13, 2012 at 10:49 AM, Alexander Shenkin <ashenkin at ufl.edu> wrote:
> >>>> Hi Folks,
> >>>>
> >>>> I'm trying to find a way to remove all terms in a formula that contain a
> >>>> particular interaction.
> >>>>
> >>>> For example, in the formula below, I'd like to remove all terms that
> >>>> contain the b:c interaction.
> >>>>
> >>>>> attributes(terms( ~ a*b*c*d))$term.labels
> >>>> [1] "a"       "b"       "c"       "d"       "a:b"     "a:c"
> >>>> [7] "b:c"     "a:d"     "b:d"     "c:d"     "a:b:c"   "a:b:d"
> >>>> [13] "a:c:d"   "b:c:d"   "a:b:c:d"
> >>>>
> >>>> My eventual use is to fit models with the reduced formulas.
> >>>>
> >>>> For example:
> >>>>> my_df = data.frame( iv = runif(100), a=runif(100), b=runif(100),
> >>>> c=runif(100), d=runif(100))
> >>>>> lm(iv ~ a*b*c*d, data=my_df)
> >>>>
> >>>> I can remove particular terms with update(), but I don't see a way to
> >>>> remove all terms that contain a given combination of factors.
> >>>>
> >>>> Any help would be greatly appreciated.  Thanks!
> >>>>
> >>>> Allie
> >>>>
> >>>> ______________________________________________
> >>>> R-help at r-project.org mailing list
> >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> Bert Gunter
> >>> Genentech Nonclinical Biostatistics
> >>>
> >>> Internal Contact Info:
> >>> Phone: 467-7374
> >>> Website:
> >>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> >>>
> >>
> >> David Winsemius, MD
> >> Alameda, CA, USA
> >>
> 
> David Winsemius, MD
> Alameda, CA, USA
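
Putting the pieces of the thread together, one way to go from the
"factors" matrix back to a formula that lm() can use might look like
the sketch below. drop_interaction is a hypothetical helper name, not
an existing R function; it tests for non-zero entries (so both 1 and 2
count) and uses reformulate() in place of paste() and as.formula(). It
ignores offsets and assumes at least one term survives.

```r
# Hypothetical helper: drop every term that involves all variables in `vars`.
drop_interaction <- function(formula, vars) {
  tt <- terms(formula)
  fm <- attr(tt, "factors")
  # A non-zero entry (1 or 2) means the variable takes part in the term.
  has_all <- colSums(fm[vars, , drop = FALSE] != 0) == length(vars)
  keep <- colnames(fm)[!has_all]
  # Carry over the response (if any) and the intercept setting.
  response <- if (attr(tt, "response") > 0) formula[[2]] else NULL
  reformulate(keep, response = response,
              intercept = attr(tt, "intercept") == 1)
}

# One-sided formula, as in the examples in this thread.
drop_interaction(~ a*b*c*d, c("b", "c"))

# Fitting the reduced model on data shaped like the original poster's.
set.seed(1)
my_df <- data.frame(iv = runif(100), a = runif(100), b = runif(100),
                    c = runif(100), d = runif(100))
fit <- lm(drop_interaction(iv ~ a*b*c*d, c("b", "c")), data = my_df)
```

Passing c("a", "d") instead removes the a:d terms without depending on
how the labels happen to be ordered.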
