[R] Regressions with monotonicity constraints

Thomas Lumley tlumley at u.washington.edu
Tue Mar 13 00:19:22 CET 2001


On Mon, 12 Mar 2001, Vadim Ogranovich wrote:

> This seems to be a recurrent topic, but I don't remember hearing a
> definitive answer. I also apologize for cross-posting.
>
> Say I have a numerical response variable and a bunch of multi-level factors
> I want to use for modelling. I don't expect factor interactions to be
> important, so there will be no interactions in the model.
> All this would be a perfect job for ANOVA except for one additional
> requirement: all factors are ordered, i.e. for any two levels of a factor
> one is bigger than the other, and I want the regression to preserve that
> ordering, that is, to be monotonic.
>
> I can think of two definitions of monotonicity, weak and strong, and in case
> it matters I am more interested in the strong one, which I define as
> follows:
> If L1 < L2 within some factor F, then
> prediction(L1, other factors) < prediction(L2, other factors)
> for ANY SELECTION of 'other factors' levels.
> (A weak monotonicity could be defined as prediction(L1) < prediction(L2) on
> average).

In the absence of interactions these are equivalent, surely?

One-dimensional monotonic regression is easy using the
Pool-Adjacent-Violators algorithm.  I think an additive monotonic model
could be estimated by backfitting, as for generalised additive models:
update the function for each predictor in turn by applying the
one-dimensional algorithm to the partial residuals from the previous
iteration. Backfitting is described in "Generalized Additive Models" by
Hastie & Tibshirani (Chapman & Hall) and probably other places.
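That backfitting idea can be sketched in a few lines. The function below is a
hypothetical illustration, not code from this thread: it assumes a numeric
response y and numeric (or ordered, coded as numeric) predictors in a matrix
X, and uses base R's isoreg() for each one-dimensional isotonic fit. The name
additive.isotonic and its arguments are made up for the example.

```r
additive.isotonic <- function(X, y, tol = 1e-6, maxit = 50) {
    n <- nrow(X); p <- ncol(X)
    f <- matrix(0, n, p)            # current estimate of each component function
    mu <- mean(y)                   # overall constant
    for (it in 1:maxit) {
        f.old <- f
        for (j in 1:p) {
            ## partial residuals, leaving component j out
            r <- y - mu - rowSums(f[, -j, drop = FALSE])
            ## one-dimensional isotonic fit of r on predictor j
            o <- order(X[, j])
            fit <- isoreg(X[o, j], r[o])
            f[o, j] <- fit$yf
            ## centre each component for identifiability
            f[, j] <- f[, j] - mean(f[, j])
        }
        if (max(abs(f - f.old)) < tol) break
    }
    list(constant = mu, components = f, fitted = mu + rowSums(f))
}
```

With a single predictor this reduces to ordinary isotonic regression; with
several, each pass refits one component while holding the others fixed, just
as in GAM backfitting.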


The more difficult case is a non-additive model (i.e. one with
interactions). For discrete factors there were two papers in the Journal of
Computational & Graphical Statistics by Qian and colleagues. One of them was
@article{qian:eddy:1996,
    Author = {Qian, Shixian and Eddy, William F.},
    Title = {An Algorithm for Isotonic Regression on Ordered Rectangular
            Grids},
    Year = 1996,
    Journal = {Journal of Computational and Graphical Statistics},
    Volume = 5,
    Pages = {225--235},
    Keywords = {[Block class algorithm]; [Monotone regression]; [Recursive
               partitioning]; [Sandwich algorithm]}
}


> Any reference to S, R, C code / papers which address this topic will be
> highly appreciated.

The following does one-dimensional isotonic regression. I believe similar
code has been posted before. It is recursive but seems fast enough for
many uses. It returns a compressed form of the result where the first
component has the distinct fitted values and the second indicates the
number of repeats.

	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle



pava.blocks <- function(x, w = rep(1, length(x)), b = rep(1, length(x)),
                        up = TRUE) {
    ## Pool-Adjacent-Violators: isotonic (up = TRUE) or antitonic
    ## (up = FALSE) regression of x with weights w.  b counts how many
    ## original observations each (possibly pooled) value represents.
    if (length(x) == 1)
        return(list(x = x, blocks = b, increasing = up))
    lasti <- 1
    for (i in 2:length(x)) {
        violates <- if (up) x[i] <= x[lasti] else x[i] > x[lasti]
        if (violates) {
            ## pool the violating pair into their weighted mean
            wtotal <- w[lasti] + w[i]
            x[lasti] <- (x[lasti] * w[lasti] + x[i] * w[i]) / wtotal
            w[lasti] <- wtotal
            b[lasti] <- b[lasti] + b[i]
            b[i] <- 0               # mark as absorbed into the block
        } else {
            lasti <- i              # no violation: start a new block
        }
    }
    ## pooling can create new violations; recurse on the pooled series
    if (any(b == 0)) {
        subset <- b > 0
        pava.blocks(x[subset], w[subset], b[subset], up)
    } else {
        list(x = x, blocks = b, increasing = up)
    }
}
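For example, on a short series with one violation (the rep() expansion step
is not part of the original post, just an illustration of how to turn the
compressed result back into one fitted value per observation):

```r
fit <- pava.blocks(c(1, 3, 2, 4))
fit$x                   # distinct fitted values: 1 2.5 4
fit$blocks              # repeat counts:          1 2   1
rep(fit$x, fit$blocks)  # per-observation fits:   1 2.5 2.5 4
```

The violating pair (3, 2) is pooled into its mean 2.5, and the expanded fits
agree with base R's isoreg() on the same data.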


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


