[R] Multiple Response CART Analysis

Bardwell, Jeff H Jeff_Bardwell at baylor.edu
Thu Dec 13 20:23:31 CET 2007

Thank you for the reply.  With the improved formula, mvpart worked like a charm.
Jeff H Bardwell, M.S.
Biology Department
ENV 1101 Lab Coordinator
Goebel 115, OH: Thu 1pm-4pm
710-6596 (e-mail preferred)


From: Gavin Simpson [mailto:gavin.simpson at ucl.ac.uk]
Sent: Mon 12/10/2007 4:01 PM
To: Bardwell, Jeff H
Cc: r-help at r-project.org
Subject: Re: [R] Multiple Response CART Analysis

On Mon, 2007-12-10 at 14:17 -0600, Bardwell, Jeff H wrote:
> Dear R friends-
> I'm attempting to generate a regression tree with one gradient
> predictor and multiple responses, trying to test if change in size
> (turtle.data$Clength) acts as a single predictor of ten multiple diet
> taxa abundances (prey.data).  Neither rpart nor mvpart seems to allow me
> to do multiple responses.  (Or if they can, I'm not using the
> functions properly.)
> > library(rpart)
> > turtle.rtree<-rpart(prey.data~., data=turtle.data$Clength,
> method="anova", maxsurrogate=0, minsplit=8, minbucket=4, xval=10);
> plot(turtle.rtree); text(turtle.rtree)
> Error in terms.formula(formula, data = data) :
>         '.' in formula and no 'data' argument

rpart doesn't do multiple responses - try package mvpart for a drop-in
replacement. Alternatively look at package party.

Also, you are not using the formula correctly. What you should have written is:

prey.data ~ Clength, data = turtle.data

What R does is look up the variables named in the formula within the data
argument, and IIRC data is supposed to be a data frame. Anything it doesn't
find in data, it looks for in the workspace. That is a simplification (the
real answer has to do with environments and their parents), but in this
case it is effectively what happens.
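
The lookup described above can be checked with base R alone. This is a minimal sketch using made-up stand-in data (the real turtle.data and prey.data are not shown in the thread, and the column names here are hypothetical); model.frame() is used only to show where each variable is found.

```r
# Hypothetical stand-in data; not the poster's real objects.
set.seed(1)
turtle.data <- data.frame(Clength = runif(20, 5, 30))
prey.data <- data.frame(snails = rpois(20, 3), insects = rpois(20, 5))

# Clength is found as a column of `data`. prey.data is not a column of
# turtle.data, so it is picked up from the workspace instead.
mf <- model.frame(data.matrix(prey.data) ~ Clength, data = turtle.data)

names(mf)                 # the response term and "Clength"
dim(model.response(mf))   # the response is a 20 x 2 matrix
```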

> When I switch response for predictor, it works.  But this is the
> opposite of what I wanted to test and gives me splits at abundance
> values, not carapace length values.
> > turtle.rtree<-rpart(turtle.data$Clength~., data=prey.data,
> method="anova", maxsurrogate=0, minsplit=8, minbucket=4, xval=10);
> plot(turtle.rtree); text(turtle.rtree)
> >

Of course that works: R has to expand . from data, and here prey.data is a
data frame. R picks up turtle.data$Clength from the workspace. But this
isn't a multivariate tree. You are confusing the formula problem above with
not being able to deal with multiple responses.
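
To see the expansion of . for yourself, here is a small base R sketch with hypothetical stand-in data (the column names are invented for illustration). terms() expands the dot into every column of the data frame given as data:

```r
# Hypothetical stand-in data; not the poster's real objects.
prey.data <- data.frame(snails = c(2, 0, 5), insects = c(1, 4, 3))
Clength <- c(10.2, 14.8, 21.5)   # lives in the workspace, not in prey.data

# The dot is replaced by all columns of `data`.
tt <- terms(Clength ~ ., data = prey.data)
attr(tt, "term.labels")   # "snails" "insects"
```

So with data = prey.data the prey taxa become the predictors, which is why the splits come out at abundance values.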

If mvpart is not working then you need to show why and how it fails.
From your description, the response is abundances of prey species, this
should be fine in mvpart.

Note though, that mvpart seems to need a data.matrix as the response so
something like this should work:

data.matrix(prey.data) ~ Clength, data = turtle.data
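
Put together as a full call, a hedged sketch might look like the following. The data are simulated stand-ins with invented column names, the object name turtle.mrt is arbitrary, and the fit is only attempted if mvpart happens to be installed; no tuning arguments are passed, so everything is left at the package defaults.

```r
# Hypothetical stand-in data; not the poster's real objects.
set.seed(42)
n <- 30
turtle.data <- data.frame(Clength = runif(n, 5, 30))
prey.data <- data.frame(
  snails  = rpois(n, lambda = 2),
  insects = rpois(n, lambda = 4),
  plants  = rpois(n, lambda = 6)
)

# The response should be a numeric matrix with one row per turtle.
resp <- data.matrix(prey.data)
stopifnot(is.matrix(resp), nrow(resp) == nrow(turtle.data))

# Assumption: mvpart is installed. The guard skips the fit otherwise.
if (requireNamespace("mvpart", quietly = TRUE)) {
  library(mvpart)
  turtle.mrt <- mvpart(data.matrix(prey.data) ~ Clength, data = turtle.data)
  print(turtle.mrt)
}
```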



> I've heard polymars recommended for this sort of situation.  I've
> downloaded the polyspline library, but get bogged down in the
> equation.  Also, it doesn't seem like polymars will generate a tree
> even if I do get it working.  Can rpart be modified in some way to
> accommodate multiple response parameters?  If anyone's ever come across
> this situation before, pointers would be much appreciated.  Thanks.
> Sincerely,
> Jeff Bardwell

 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
