[R] help speeding up simple Theil regression function

Brad Schneid bps0002 at auburn.edu
Sun Oct 21 20:40:23 CEST 2012


Obviously sort() is not needed in the following line:

 X <- median( sort( do.call(c, num) / do.call(c, dom) ) ) 







Brad Schneid wrote
> Hello,
> 
> I am working on a simple non-parametric (Theil) regression function and
> and am following Hollander and Wolfe 1999 text.  I would like some help
> making my function faster.  I have compared with pre-packaged version from
> "MBLM", which isnt very fast either, but it appears mine is faster with N
> = 1000 (see results below).  I plan on running this function repeatedly,
> and I generally have data lengths of ~ N = 6000 or more.  
> 
> # My function following Hollander and Wolfe text, Chapter 9
> np.lm <-function(dat, X, Y, ...){
> 	# Ch 9.2: Slope est. (X) for Thiel statistic
> 	combos <- combn(nrow(dat), 2)
> 	i.s <- combos[1,] 
> 	j.s <- combos[2,] 
> 	num <- vector("list", length=length(i.s))
> 	dom <- vector("list", length=length(i.s))
> 
> 		for(i in 1:length(i.s)){
> 			num[[i]]  <- dat[j.s[i],Y] - dat[i.s[i],Y]    
> 			dom[[i]]  <- dat[j.s[i],X] - dat[i.s[i],X]    
> 	        	 	}	
> 	
> 	X <- median( sort( do.call(c, num) / do.call(c, dom) ) )
> 	# Ch 9.4: Intercept est. for Thiel statistic
> 	Intercept <- median(dat[,"Y"] - X*dat[,"X"])
> 	out <- data.frame(Intercept, X)
> 	return(out)
> 		}   # usage: np.lm(dat, X=1, Y=2)
> ################################################################
> 
> library("mblm") # I will compare to mblm() function
> 
> X <- rnorm(1000)
> Y <- rnorm(1000)
> dat <- data.frame(X, Y)
> 
> system.time(np.lm(dat, X=1, Y=2) )
>    user  system elapsed 
> 118.610   0.130 119.144 
> 109.000   0.040 109.416 # ran it twice
>  86.190   0.100  86.589 # 3rd time
> 
> system.time( mblm(Y~X, dat, repeated=F) )
>     user   system  elapsed 
> 1509.200   87.670 1602.987 
> # not waiting on that to run again
> 
> OK, mine appears to be way faster... oddly, mblm() seemed faster with
> smaller (n=100) datasets
> 
> Can someone please tell me how they would improve the np.lm() function to
> make it quicker, or faster with some other tricks (parallel?.. Ive never
> done that.).
> 
> Thank you ahead of time for any help. 
> Brad
> 
> Ubuntu 10.04
> Intel i5 CPU 4*(650 @ 3.20GHz)
> 11.6 GiB memory
> R version 2.15.0 (2012-03-30)





--
View this message in context: http://r.789695.n4.nabble.com/help-speeding-up-simple-Theil-regression-function-tp4646923p4646926.html
Sent from the R help mailing list archive at Nabble.com.




More information about the R-help mailing list