[R] Fitting multiple horizontal lines to data

David Winsemius dwinsemius at comcast.net
Wed Nov 6 23:00:14 CET 2013


On Nov 6, 2013, at 9:19 AM, Sashikanth Chandrasekaran wrote:

> I am not trying to fit a horizontal line at every unique value of y. I am
> trying fit the y values with as few horizontal lines by trading off the
> number of horizontal lines with the error. The actual problem I am trying
> to solve is to smooth data in a time series. Here is a realistic example of
> y
> 
> y=c(134.45,141.82,143.81,141.81,145,141.61,143.72,145.71,200,175,140,200,148.77,71.64,111.57,118.15,119.15,112.8,111.64,111.64,157.26,143.8,40.19,64.99,64.99,129.98,64.99,65,64.98,64.99)
> 
> An example fit for y using multiple horizontal lines (may not be the best
> fit in terms of squared error or another error metric, but I have included
> the y value for concreteness)
> 

The human brain searches for patterns and often finds them where there is no underlying mechanism. If you are asking for a regime-change method that is statistically based and will replicate your brain-driven pencil-and-paper methods you will probably be disappointed.

> plot(y)
> abline(h=c(140,110,150,65) )
> abline(v=c(13,20,22,30) ,col="red")


> 1. A horizontal line at approximately y=140 (to fit the first 13 values -
> 134.45 to 148.77)
> 2. A horizontal line at approximately y=110 (to fit the next 7 values -
> 71.64 to 111.64)
> 3. A horizontal line at approximately y=150 (to fit the next 2 values -
> 157.26 to 143.8)
> 4. A horizontal line at approximately y=65 (to fit the last 8 values -
> 40.19 to 64.99)
> -sashi.

If you want a method that is driven by the magnitude of the shift in adjacent values, then this would find some but not all of your proposed breakpoints:

> which( abs(diff(y)) >55)
[1] 11 13 22 25 26

You could perhaps refine that set of candidates by requiring that the next value have some other defining feature but I was unable to come up with a simple rule-set that agreed with your candidates . There are packages that do segmented regression but hey are not generall set up to assume all regression coefficients are 0 and that you are only interested in


> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA



More information about the R-help mailing list