[R] trouble automating formula edits when log or * are present; update trouble

cberry at tajo.ucsd.edu cberry at tajo.ucsd.edu
Tue May 29 19:48:33 CEST 2012


Paul Johnson <pauljohn32 at gmail.com> writes:

> Greetings
>
> I want to take a fitted regression and replace all uses of a variable
> in a formula. For example, I'd like to take
>
> m1 <- lm(y ~ x1, data=dat)
>
> and replace x1 with something else, say x1c, so the formula would become
>
> m1 <- lm(y ~ x1c, data=dat)

Some incantation involving substitute(), perhaps?

> frm <- y ~ log( x ) * ( w + u )
> do.call( substitute, list( frm, list( x = as.name("z") ) ) )
y ~ log(z) * (w + u)
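
If you want that wrapped in a function that takes the variable names as character strings, something along these lines might do. It is only a sketch: replaceVar is a made-up name, not an existing function, and the as.formula() call is there on the assumption that substitute() may hand back a plain call rather than a formula.

```r
## Hypothetical helper: replace every use of variable `old` in a formula with `new`.
replaceVar <- function(fmla, old, new) {
    ## Named list mapping the old symbol to the new one, e.g. list(x1 = x1c)
    subs <- setNames(list(as.name(new)), old)
    ## do.call() hands the formula object itself to substitute(),
    ## rather than the unevaluated symbol `fmla`
    res <- do.call(substitute, list(fmla, subs))
    ## Make sure the result is a formula in the original environment
    as.formula(res, env = environment(fmla))
}

newFmla <- y ~ log(x1) + x2 * x3
replaceVar(newFmla, "x1", "x1c")
replaceVar(newFmla, "x2", "x2c")
```

Because the replacement happens on the parsed formula, x1 inside log(x1) or inside an interaction is caught as well.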


HTH,

Chuck


>
> I have working code to finish that part of the problem, but it fails
> when the formula is more complicated. If the formula has log(x1) or
> x1:x2, the update code I'm testing doesn't get it right.
>
> Here's the test code:
>
> ##PJ
> ## 2012-05-29
> dat <- data.frame(x1=rnorm(100,mean=50), x2=rnorm(100,mean=50),
> x3=rnorm(100,mean=50), y=rnorm(100))
>
> m1 <- lm(y ~ log(x1) + x1 + sin(x2) + x2 + exp(x3), data=dat)
> m2 <- lm(y ~ log(x1) + x2*x3, data=dat)
>
> suffixX <- function(fmla, x, s){
>     upform <- as.formula(paste0(". ~ .", "-", x, "+", paste0(x, s)))
>     update.formula(fmla, upform)
> }
>
> newFmla <- formula(m2)
> newFmla
> suffixX(newFmla, "x2", "c")
> suffixX(newFmla, "x1", "c")
>
> The last few lines of the output. See how the update misses x1 inside
> log(x1) or in the interaction?
>
>
>> newFmla <- formula(m2)
>> newFmla
> y ~ log(x1) + x2 * x3
>> suffixX(newFmla, "x2", "c")
> y ~ log(x1) + x3 + x2c + x2:x3
>> suffixX(newFmla, "x1", "c")
> y ~ log(x1) + x2 + x3 + x1c + x2:x3
>
> It gets the target if the target is all by itself, but not otherwise.
>
> After messing with this for quite a while, I conclude that update was
> the wrong way to go because it is geared to replacement of individual
> bits, not editing all instances of a thing.
>
> So I started studying the structure of formula objects.  I noticed
> this really interesting thing: the newFmla object can be probed
> recursively to eventually reveal all of the individual pieces:
>
>
>> newFmla
> y ~ log(x1) + x2 * x3
>> newFmla[[3]]
> log(x1) + x2 * x3
>> newFmla[[3]][[2]]
> log(x1)
>> newFmla[[3]][[2]][[2]]
> x1
>
> So, if you could tell me of a general way to "walk" through a formula
> object, couldn't I use "gsub" or something like that to recognize each
> instance of "x1" and replace with "x1c"??
>
> I just can't figure how to automate the checking of each possible
> element in a formula, to get the right combination of [[]][[]][[]].
> See what I mean? I need to avoid this:
>
>> newFmla[[3]][[2]][[3]]
> Error in newFmla[[3]][[2]][[3]] : subscript out of bounds
>
> pj
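
As for walking the formula by hand: a formula is a call, so a recursive function can visit every element without knowing the nesting depth in advance. A rough sketch follows. walkReplace is a hypothetical name; it skips the function position of each call, so log itself is never renamed, and it does not try to handle calls with empty arguments such as x[, 1].

```r
## Hypothetical helper: walk a formula (or any call) and rename one variable.
walkReplace <- function(e, old, new) {
    if (is.name(e)) {
        ## Leaf: swap the symbol if it matches, otherwise leave it alone
        if (identical(as.character(e), old)) as.name(new) else e
    } else if (is.call(e)) {
        ## Node: e[[1]] is the function (~, +, log, ...), so recurse
        ## only into the arguments e[[2]], e[[3]], ...
        for (i in seq_along(e)[-1]) {
            e[[i]] <- walkReplace(e[[i]], old, new)
        }
        e
    } else {
        e  # constants such as numbers or strings
    }
}

newFmla <- y ~ log(x1) + x2 * x3
walkReplace(newFmla, "x1", "x1c")
```

Checking is.call() before descending is what avoids the "subscript out of bounds" error: the function never indexes past the end of a call and never indexes into a bare symbol.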

-- 
Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at ucsd edu			    UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901


