Remy X.O. Martin vsxo at hotmail.com
Thu Jul 4 18:45:11 CEST 2002

Hi all,

I am trying to get to grips with rpart, and find it not very easy given the 
information that comes with the package. Unlike, e.g., the ctest package docs, the 
rpart docs do not say when "an rpart" could be used or how to interpret the results. 
Here are a few of the open questions I have:

1) In ?rpart I read: "...method: one of.... If y is a survival object...". There is a 
similar fleeting reference just above, under 'na.action', and rpart itself has a 
y=TRUE argument. I *suppose* that this "y" refers to a formula of the form y~x, with 
y being the dependent variable -- or is this a (minor) bug in the documentation?

2) It looks like rpart and aov are in a way complementary, and should to a certain 
degree give comparable results. Is there a "user's guide" document somewhere that 
describes this in language accessible to "generic scientists" (= non-statisticians)?

3) I just applied rpart to a dataset, and saw something that seems counter-intuitive 
at the very least: a branch is made (n=84) between fac1=A (n=28) and fac1=B,C (n=56). 
The fac1=A branch is an end-node; the fac1=B,C branch is itself branched into fac1=C 
(n=28) and (!) fac1=A,B (n=28)! According to the n-counts, there should be no more A 
in that latter branch (or in the fac1=B,C branch in general). Am I misunderstanding 
something essential, or is this a bug (in the branch criterion label and/or the 
n-count label)?

Thanks in advance!


