[R] rpart weight prior

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jul 9 13:27:07 CEST 2007


On Sun, 8 Jul 2007, Aurélie Davranche wrote:

> Hi!
>
> Could you please explain the difference between "prior" and "weight" in 
> rpart? They seem to be the same. If so, why include a weight option in 
> the latest versions? For unbalanced sampling, which is best to use: 
> weight, prior, or both together?

The argument is 'weights' (not 'weight'); it has been there for a decade, 
and is not the same as the 'prior' component of 'parms'.

The help file (which you seem unfamiliar with) says

  weights: optional case weights.

    parms: optional parameters for the splitting function. Anova
           splitting has no parameters. Poisson splitting has a single
           parameter, the coefficient of variation of the prior
           distribution on the rates.  The default value is 1.
           Exponential splitting has the same parameter as Poisson. For
           classification splitting, the list can contain any of: the
           vector of prior probabilities (component 'prior'), the loss
           matrix (component 'loss') or the splitting index (component
           'split').  The priors must be positive and sum to 1.  The
           loss matrix must have zeros on the diagonal and positive
           off-diagonal elements.  The splitting index can be 'gini' or
           'information'.  The default priors are proportional to the
           data counts, the losses default to 1, and the split defaults
           to 'gini'.
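
As a rough illustration of the distinction (a sketch, not taken from the 
help file or the technical report): case weights are given per observation 
through 'weights', whereas priors are given per class inside 'parms'. The 
example assumes the kyphosis data shipped with rpart; the 3:1 weights, the 
equal priors and the loss values are arbitrary choices for illustration only.

```r
library(rpart)

# kyphosis ships with rpart; the response Kyphosis is a two-level factor
data(kyphosis)
table(kyphosis$Kyphosis)

# 'weights': one non-negative number per case (row of the data).
# Illustrative assumption: count each "present" case three times as heavily.
w <- ifelse(kyphosis$Kyphosis == "present", 3, 1)
fit_w <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
               weights = w)

# 'prior': one probability per class, in the order of levels(Kyphosis),
# passed as a component of 'parms'. Must be positive and sum to 1.
fit_p <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
               parms = list(prior = c(0.5, 0.5)))

# 'loss' and 'split' are further components of the same 'parms' list.
# The loss matrix has zeros on the diagonal and positive off-diagonal
# entries; the values 1 and 2 here are arbitrary.
L <- matrix(c(0, 1,
              2, 0), nrow = 2, byrow = TRUE)
fit_l <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
               parms = list(prior = c(0.5, 0.5), loss = L,
                            split = "information"))

# Compare the fitted trees
print(fit_w)
print(fit_p)
print(fit_l)
```

Note that 'w' has length nrow(kyphosis), while 'prior' has one entry per 
class: the former is a property of the cases, the latter of the classes.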

The rpart technical report at

http://mayoresearch.mayo.edu/mayo/research/biostat/upload/61.pdf

may help you understand this.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

