[R] Extracting the terms from an rpart object

William Dunlap wdunlap at tibco.com
Wed Jan 26 19:21:05 CET 2011


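Note that all.vars(terms(fit)) only looks at the
formula in the terms object and throws away all
the analysis done by rpart's call to terms(formula,data).
Here is a contrived example of that approach failing
(the response is an expression rather than a bare variable,
and ageThreshold is not a column of the data):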

  > library(rpart) # provides rpart() and the kyphosis data
  > ageThreshold <- 50
  > fit3 <- rpart(Kyphosis=="present" ~ (Age>ageThreshold) + log(Number, base=2) + Start, data=kyphosis)
  > all.vars(terms(fit3))
  [1] "Kyphosis"     "Age"          "ageThreshold" "Number"       "Start"
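
To make the two problems concrete, here is a minimal sketch, assuming
the rpart package and its kyphosis data are available. The names vars,
not_in_data and dot_terms are only illustrative. It shows which symbol
returned by all.vars() is not a data column, and that the terms object
stored in the fit already has the "." expanded:

```r
library(rpart)  # recommended package; also supplies the kyphosis data frame

ageThreshold <- 50
fit3 <- rpart(Kyphosis == "present" ~ (Age > ageThreshold) +
                log(Number, base = 2) + Start,
              data = kyphosis)

# all.vars() returns every symbol in the formula, including ageThreshold,
# which is a global variable rather than a column of the data.
vars <- all.vars(terms(fit3))
not_in_data <- setdiff(vars, names(kyphosis))
print(not_in_data)

# The "." in a formula is expanded by terms(formula, data = ...), so the
# terms object stored in the fit lists the actual predictors.
fit2 <- rpart(Kyphosis ~ ., data = kyphosis)
dot_terms <- attr(terms(fit2), "term.labels")
print(dot_terms)
```

Filtering against names(kyphosis) gets rid of ageThreshold, but it still
cannot tell you that the response was an expression, not a plain column.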

Looking at the attributes of the terms object
tells you what I think you want:

  > attr(terms(fit3), "response") # index of the response in the "variables" list; 0 => no response
  [1] 1
  > as.list(attr(terms(fit3), "variables"))[-1]
  [[1]]
  Kyphosis == "present"

  [[2]]
  Age > ageThreshold

  [[3]]
  log(Number, base = 2)

  [[4]]
  Start
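
Putting the "response" and "variables" attributes together, something
like the following might do for pulling out the response. This is only
a sketch: response_expr is a hypothetical helper name, not part of
rpart, and it assumes terms() works on the fitted object (it does for
rpart fits, which store a terms component).

```r
library(rpart)  # recommended package; also supplies the kyphosis data frame

# Hypothetical helper: return the response of a fitted model as an
# unevaluated symbol or call, or NULL when the formula has no response.
response_expr <- function(fit) {
  tt <- terms(fit)
  idx <- attr(tt, "response")  # position in the "variables" list, 0 if none
  if (idx == 0) return(NULL)
  # as.list() of the "variables" call starts with the symbol `list`; drop it.
  as.list(attr(tt, "variables"))[-1][[idx]]
}

fit1 <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
fit2 <- rpart(Kyphosis ~ ., data = kyphosis)

resp1 <- response_expr(fit1)
resp2 <- response_expr(fit2)
print(resp1)
print(all.vars(resp2))  # data columns used by the response

# The unevaluated response can be evaluated in the data when needed.
y <- eval(resp1, kyphosis)
print(head(y))
```

Returning the unevaluated expression keeps both options open: all.vars()
gives the columns involved, and eval() in the data gives the values.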

rpart doesn't allow interaction terms (x:y), but if it did
you would want to look at the "factors" attribute to see
which items in the "variables" list are in each term of
the expanded formula.
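
As an illustration of the "factors" attribute, here is a small sketch
that calls terms() directly on a made-up formula (y, a and b are
placeholder names, and no model is fitted), since rpart itself would
reject the interaction:

```r
# terms() applied directly to a formula; no data or model is needed here.
tt <- terms(y ~ a + b + a:b)
fac <- attr(tt, "factors")
print(fac)

# Rows correspond to entries of "variables", columns to the terms of the
# expanded formula. A nonzero entry means the variable is part of the term.
vars_in_term <- lapply(colnames(fac), function(term) {
  rownames(fac)[fac[, term] > 0]
})
names(vars_in_term) <- colnames(fac)
print(vars_in_term[["a:b"]])
```

The response appears as a row of the matrix but has zeros in every
column, so it never shows up in vars_in_term.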

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Tal Galili
> Sent: Wednesday, January 26, 2011 10:07 AM
> To: Henrique Dallazuanna
> Cc: r-help at r-project.org
> Subject: Re: [R] Extracting the terms from an rpart object
> 
> Thanks Henrique, exactly what I was looking for.
> 
> 
> ----------------Contact Details:----------------
> Contact me: Tal.Galili at gmail.com | 972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
> -------------------------------------------------
> 
> 
> 
> 
> On Wed, Jan 26, 2011 at 7:40 PM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote:
> 
> > Try this:
> >
> > all.vars(terms(fit1))
> > all.vars(terms(fit2))
> >
> >
> > On Wed, Jan 26, 2011 at 3:33 PM, Tal Galili <tal.galili at gmail.com> wrote:
> >
> >> Hello all,
> >>
> >> I wish to extract the terms from an rpart object.
> >> Specifically, I would like to be able to know what is the response
> >> variable (so I could do some manipulation on it).
> >> But in general, such a method for rpart will also need to handle
> >> a "." case (see fit2)
> >>
> >> Here are two simple examples:
> >>
> >> fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
> >> fit1$call
> >> fit2 <- rpart(Kyphosis ~ ., data=kyphosis)
> >> fit2$call
> >>
> >>
> >> Is there anything "prettier" than using string manipulation?
> >>
> >>
> >> Thanks.
> >
> >
> >
> > --
> > Henrique Dallazuanna
> > Curitiba-Paraná-Brasil
> > 25° 25' 40" S 49° 16' 22" O
> >