[R] Automatically Remove Aliased Terms from a Model

Thaler,Thorn,LAUSANNE,Applied Mathematics Thorn.Thaler at rdls.nestle.com
Thu Oct 31 09:16:26 CET 2013


Dear Eik,

Thanks for your answer. Indeed, I think I was not too clear about what I want to achieve, so let me rephrase:

If there are aliased predictors in my model, I can see them via the alias function:

d <- expand.grid(a = 0:1, b=0:1)
d$c <- (d$a + d$b)  %% 2
d$y <- rnorm(4)
d <- within(d, {a <- factor(a); b <- factor(b); c <- factor(c)})
l <- lm(y ~ a * b + c, d)
alias(l)
Model :

y ~ a * b + c

Complete :
      (Intercept) a1   b1   c1  
a1:b1    0         1/2  1/2 -1/2

I see from alias that the _coefficient_ a1:b1 is aliased and thus cannot be estimated. Hence, I want to remove the _term_ a:b from the model. While this is straightforward for continuous predictors, where the coefficient name equals the term name, it is not that easy for aliased qualitative predictors (i.e. factors), because the coefficient names of a _term_ include the factor levels: a term like a:b (i.e. an interaction between two 2-level factors) yields 1 _coefficient_ named a1:b1. In order to use the update function, I have to translate the output of alias (which gives aliased _coefficients_) back to _terms_. Something like update(l, . ~ . - a1:b1) would not work, because a1:b1 is a coefficient name and not a term of the formula, so I have to use update(l, . ~ . - a:b), which means I have to translate a1:b1 somehow to a:b.

Overall, I want to remove any term from the model if any of its associated coefficients cannot be estimated.
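
One possible route, given here only as a minimal sketch and not as a tested general solution: the model matrix carries an "assign" attribute that maps each column (and hence each coefficient) to the index of its term, independently of level names and contrasts. The sketch assumes that the coefficient names match the column names of model.matrix(mod) and that aliased coefficients show up as NA in coef(mod), which is the default behaviour of lm. The helper name termFromCoef is hypothetical, and removeAliased is rewritten on top of it.

```r
set.seed(1)  # only to make the example reproducible
d <- expand.grid(a = 0:1, b = 0:1)
d$c <- (d$a + d$b) %% 2
d$y <- rnorm(4)
d <- within(d, {a <- factor(a); b <- factor(b); c <- factor(c)})
l <- lm(y ~ a * b + c, d)

## Hypothetical helper: map coefficient names to term labels via the
## "assign" attribute of the model matrix (0 denotes the intercept).
termFromCoef <- function(mod, coefs = names(coef(mod))) {
  mm <- model.matrix(mod)
  labels <- c("(Intercept)", attr(terms(mod), "term.labels"))
  pos <- match(coefs, colnames(mm))
  labels[attr(mm, "assign")[pos] + 1L]
}

## Drop every term that has at least one coefficient that is NA (aliased).
removeAliased <- function(mod) {
  aliased <- names(coef(mod))[is.na(coef(mod))]
  if (length(aliased) == 0L) return(mod)
  drop <- unique(termFromCoef(mod, aliased))
  update(mod, as.formula(paste(". ~ . -", paste(drop, collapse = " - "))))
}

termFromCoef(l, "a1")     # "a"
termFromCoef(l, "a1:b1")  # "a:b"
l2 <- removeAliased(l)    # refits y ~ a + b + c
```

Note that this sketch does not check marginality: if a main effect were aliased while an interaction containing it stayed in the model, simply dropping the main effect might not be what one wants.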

Is that clearer now?


KR,

-Thorn

-----Original Message-----
From: Eik Vettorazzi [mailto:E.Vettorazzi at uke.de] 
Sent: Tuesday, 29 October 2013 21:47
To: Thaler,Thorn,LAUSANNE,Applied Mathematics; R-Help Mailing List (r-help at r-project.org)
Subject: Re: [R] Automatically Remove Aliased Terms from a Model

Hi Thorn,
It is not entirely clear (at least to me) what you want to accomplish.
An easy and fail-safe way of extracting the variables used in a (g)lm object is
names(model.frame(l))
If you want to extract terms to finally select a model, have a look at
drop1 and/or MASS::dropterm
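
To illustrate the difference between variables and terms, here is a small sketch that rebuilds the example from the original post (the object names vars and trms are made up for illustration). names(model.frame(l)) returns the variables of the model frame, including the response, whereas the term labels, which also contain interactions such as a:b, are stored in the terms object.

```r
set.seed(1)  # only to make the example reproducible
d <- expand.grid(a = 0:1, b = 0:1)
d$c <- (d$a + d$b) %% 2
d$y <- rnorm(4)
d <- within(d, {a <- factor(a); b <- factor(b); c <- factor(c)})
l <- lm(y ~ a * b + c, d)

## Variables in the model frame (response first)
vars <- names(model.frame(l))          # "y" "a" "b" "c"

## Term labels of the formula (main effects first, then interactions)
trms <- attr(terms(l), "term.labels")  # "a" "b" "c" "a:b"
```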

Hth

On 28.10.2013 17:19, Thaler,Thorn,LAUSANNE,Applied Mathematics wrote:
> Dear all,
> 
> I am trying to implement a function which removes aliased terms from a model. The challenge I am facing is that with "alias" I get the aliased coefficients of the model, which I have to translate into the terms from the model formula. What I have tried so far:
> 
> ------------------8<------------------
> d <- expand.grid(a = 0:1, b=0:1)
> d$c <- (d$a + d$b)  %% 2
> d$y <- rnorm(4)
> d <- within(d, {a <- factor(a); b <- factor(b); c <- factor(c)})
> l <- lm(y ~ a * b + c, d)
> 
> removeAliased <- function(mod) {
>   ## Retrieve all terms in the model
>   X <- attr(mod$terms, "term.labels")
>   ## Get the aliased coefficients  
>   rn <- rownames(alias(mod)$Complete)
>   ## remove factor levels from coefficient names to retrieve the terms
>   regex.base <- unique(unlist(lapply(mod$model[, sapply(mod$model, is.factor)], levels)))
>   aliased <- gsub(paste(regex.base, "$", sep = "", collapse = "|"),  "", gsub(paste(regex.base, ":", sep = "", collapse = "|"), ":", rn))
>   uF <- formula(paste(". ~ .", paste(aliased, collapse = "-"), sep = "-"))
>   update(mod, uF)
> }
> 
> removeAliased(l)
> ------------------>8------------------
> 
> This function works in principle, but this workaround with removing the factor levels is just, well, a workaround which could cause problems in some circumstances (when the name of a level matches the end of another variable, when I use a different contrast and R names the coefficients differently etc. - and I am not sure which other cases I am overlooking).
> 
> So my question is whether there are some more intelligent ways of doing what I want to achieve? Is there a function to translate a coefficient of a LM back to the term, something like:
> 
> termFromCoef("a1") ## a
> termFromCoef("a1:b1") ## a:b
> 
> With this I could simply translate the rownames from alias into the terms needed for the model update.
> 
> Thanks for your help.
> 
> Kind Regards,
> 
> Thorn Thaler
> NRC Lausanne
> Applied Mathematics
> 

--
Eik Vettorazzi

Department of Medical Biometry and Epidemiology University Medical Center Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790