[R] library(rpart) or library(tree)

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Dec 19 23:09:45 CET 2007


You appear to have fitted a regression tree, which does not seem to be 
what your interpretation of 'pnV22' requires.
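
To illustrate the distinction, here is a minimal sketch on made-up data (the
data frame 'd' and its columns 'x', 'y' and 'yf' are hypothetical, not the
poster's). With a numeric 0/1 response rpart fits a regression tree, whose
printout carries the "deviance, yval" header seen in the quoted output below;
with a factor response it fits a classification tree.

```r
library(rpart)

set.seed(42)
# Hypothetical data: one predictor and a 0/1 outcome stored as numbers
d <- data.frame(x = runif(300))
d$y  <- as.integer(d$x > 0.5)
d$yf <- factor(d$y, levels = c(0, 1), labels = c("no", "yes"))

# Numeric response: rpart chooses method "anova" (a regression tree)
fit_reg <- rpart(y ~ x, data = d)

# Factor response: rpart chooses method "class" (a classification tree)
fit_cls <- rpart(yf ~ x, data = d)

fit_reg$method
fit_cls$method

# Predicted classes and class probabilities from the classification tree
head(predict(fit_cls, d, type = "class"))
head(predict(fit_cls, d, type = "prob"))
```

Passing method = "class" explicitly is an alternative to converting the
response, but a factor makes the intent visible in the data itself.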

I have little idea what you actually did, but am confident that it is not 
what you claim you did.

Also, note fortune("dog"):

Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'?
Anyway, it might clash with the function 'matrix'.
    -- Barry Rowlingson
       R-help (October 2004)

On Wed, 19 Dec 2007, Ingo Holz wrote:

> Hi,
>
> I have a problem with library (rpart) (and/or library(tree)).
>
> I use a data.frame with variables
> "pnV22" (observation: 1, 0 or yes, no)
> "JTemp" (mean temperature)
> "SNied"  (summer rain)
>
> I used function "rpart" to build a model:
>
> 	library(rpart)
> 	attach(data.frame)
> 	result <- rpart(pnV22 ~ JTemp + SNied)
>
> I got the following tree:

I don't believe that: how could rpart know about 'punkte'?

>  n=55518 (50 observations deleted due to missingness)
>
> node), split, n, deviance, yval
>      * denotes terminal node
>
> 1) root 55518 668.744500 0.0121942400
>   2) punkte[["JTemp"]]< 10.35 51251  18.992960 0.0003707245 *
>   3) punkte[["JTemp"]]>=10.35 4267 556.532000 0.1542067000
>     6) punkte[["SNied"]]>=450 3136 291.318600 0.1036352000 *
>     7) punkte[["SNied"]]< 450 1131 234.954900 0.2944297000
>      14) punkte[["JTemp"]]>=10.55 723 113.502100 0.1950207000 *
>      15) punkte[["JTemp"]]< 10.55 408 101.647100 0.4705882000
>        30) punkte[["JTemp"]]< 10.45 48   4.479167 0.1041667000 *
>        31) punkte[["JTemp"]]>=10.45 360  89.863890 0.5194444000 *
>
> I constructed a simple new.data.frame:
>
>     new.data.frame <- data.frame
>     new.data.frame[,"JTemp"] <- 10.5
>     new.data.frame[,"SNied"] <- 430
>
> Then I used predict() to predict values for "pnV22" in the following way:
>
>    pred <- predict(result, data.frame)
>    pred2 <- predict(result, new.data.frame)

It is not finding the new values from the new data frame: they do not have 
names like 'punkte[["JTemp"]]'.
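
A sketch of the likely mechanism, assuming the model was fitted with terms
such as punkte[["JTemp"]], as the printed tree suggests. The data are
simulated, and 'bad', 'good' and 'new' are hypothetical names. predict()
rebuilds the predictors from the formula, so it picks up the original
'punkte' rather than the columns of the new data frame. Plain column names
together with 'data =' avoid this.

```r
library(rpart)

set.seed(1)
# Simulated stand-in for the poster's data (values are made up)
punkte <- data.frame(JTemp = runif(200, 8, 12),
                     SNied = runif(200, 300, 600))
punkte$pnV22 <- as.numeric(punkte$JTemp > 10.3 & runif(200) < 0.5)

new <- data.frame(JTemp = 10.5, SNied = 430)

# Formula refers to the object 'punkte' directly: 'new' has no such variable,
# so predict() evaluates punkte[["JTemp"]] etc. on the original data
bad <- rpart(punkte[["pnV22"]] ~ punkte[["JTemp"]] + punkte[["SNied"]])
p_bad <- predict(bad, new)
length(p_bad)                      # one value per row of 'punkte', not of 'new'

# Formula uses column names, resolved in 'data' and later in 'newdata'
good <- rpart(pnV22 ~ JTemp + SNied, data = punkte)
p_good <- predict(good, new)
length(p_good)                     # one value for the single new row
```

With the second form, the single new row gives a single prediction; with the
first, the "new" predictions simply repeat the fitted values.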

> The results are the same, which I checked by plotting the values of pred and pred2 and by
>
>   table(pred ==pred2)  which is true for all values.
>
> Looking at the tree I would expect that pred2 has the same high value for all elements of the
> vector. Did I make a mistake?
>
> Thanks, Ingo
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


