[R] using a variable for a column name in a formula

peter dalgaard pdalgd at gmail.com
Mon Oct 14 09:41:19 CEST 2013


On Oct 13, 2013, at 23:40 , Sarah Goslee wrote:

> This being R, there are likely other ways, but I use:
> 
> lm(as.formula(paste(nnn, "~ .")),data=X)
> 

That'll do for most purposes, but fortune("parse") applies. In particular: what happens if nnn is something like "weight (kg)"?
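
To make the "weight (kg)" question concrete, here is a small sketch. The data frame X and its columns are invented for illustration; only nnn and the paste() idiom come from the thread. The pasted string should parse as a call to a function weight() with argument kg, so lm() is expected to fail when it tries to evaluate the response. Wrapping the name in backticks is one way to keep the paste() approach alive, though it would still break on a name that itself contains a backtick.

```r
# Made-up data; check.names = FALSE keeps the non-syntactic column name intact
X <- data.frame(height = c(150, 160, 170, 180, 175, 165),
                age = c(30, 25, 41, 35, 52, 28),
                `weight (kg)` = c(52, 61, 70, 83, 74, 66),
                check.names = FALSE)
nnn <- "weight (kg)"

# The pasted text "weight (kg) ~ ." parses, but not as intended
f_bad <- as.formula(paste(nnn, "~ ."))
is.call(f_bad[[2]])   # the left-hand side is the call weight(kg), not a column name

# lm() then fails while evaluating the response; capture the error instead of stopping
res <- tryCatch(lm(f_bad, data = X), error = function(e) e)
inherits(res, "error")

# Backticks make the parser read the whole string as a single name
f_ok <- as.formula(paste0("`", nnn, "` ~ ."))
is.name(f_ok[[2]])
```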

I'd prefer to do a little computing on the language, as in

fml <- dummy ~ .
fml[[2]] <- as.name(nnn)
lm(fml, data=X)
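
As a rough sketch of how this could serve the application in the original question, the three lines above can be wrapped in a helper and applied to every column in turn. The helper name response_formula, the example data, and the use of R-squared as the measure of "best predicted" are all my assumptions, not something from the thread.

```r
# Hypothetical helper: build `name ~ .` from a column name held in a string
response_formula <- function(name) {
  fml <- dummy ~ .            # template; `dummy` is only a placeholder
  fml[[2]] <- as.name(name)   # element 2 of a two-sided formula is its left-hand side
  fml
}

# Made-up numeric data, including one non-syntactic column name
X <- data.frame(height = c(150, 160, 170, 180, 175, 165),
                age = c(30, 25, 41, 35, 52, 28),
                `weight (kg)` = c(52, 61, 70, 83, 74, 66),
                check.names = FALSE)

# Regress each column on all the others and record R-squared
# (R-squared is an assumed criterion, chosen here only for illustration)
r2 <- vapply(names(X), function(nm) {
  summary(lm(response_formula(nm), data = X))$r.squared
}, numeric(1))

# Name of the column with the highest R-squared
best <- names(which.max(r2))
```

Because the column name goes in as a symbol rather than as text to be parsed, "weight (kg)" needs no special treatment here.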

[bquote() can be useful too; just beware that its result is a call and needs as.formula() to become a formula, i.e.

nnsym <- as.name(nnn)
fml <- as.formula(bquote(.(nnsym) ~ .))

It saves you the trouble of having to figure out how to index your way into the formula. In the present case that isn't really hard, but it will be if you need to substitute something into a complicated right-hand side.]
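
To illustrate the complicated right-hand side case, here is a minimal sketch in which the same variable is substituted in two nested places. The column names "dose (mg)", "height" and "age" are hypothetical and chosen only to show the mechanics.

```r
# Hypothetical column names held in strings
resp <- as.name("height")
pred <- as.name("dose (mg)")

# .() splices the symbols into the expression wherever they occur,
# so no indexing into the formula is needed
cl <- bquote(.(resp) ~ log(.(pred)) + age + I(.(pred)^2))
class(cl)                 # a plain call at this point, not yet a formula

fml <- as.formula(cl)
inherits(fml, "formula")
all.vars(fml)             # the variables the formula refers to
```

Doing the same by indexing would mean working out where each occurrence of the predictor sits inside the nested calls on the right-hand side.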

-pd

> 
> Sarah
> 
> On Sun, Oct 13, 2013 at 5:04 PM, David Epstein
> <David.Epstein at warwick.ac.uk> wrote:
>> lm(height ~ ., data=X)
>> works fine.
>> 
>> However
>> nnn <- "height" ;  lm(nnn ~ . ,data=X)
>> fails
>> 
>> How do I write such a formula, which depends on the value of a string variable like nnn above?
>> 
>> A typical application might be a program that takes a data frame containing only numerical data, and figures out which of the columns can be best predicted from all the other columns.
>> 
>> Thanks
>> David
>> 
> 
> -- 
> Sarah Goslee
> http://www.functionaldiversity.org
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


