[R] Smoothing Techniques - short stepwise functions with spikes

Liaw, Andy andy_liaw at merck.com
Tue May 11 14:10:54 CEST 2010


I'm surprised no one except Ralf mentioned tree-based methods.  Basic
regression trees fit exactly the type of function (piecewise constant)
that Ralf is asking about.  So tree(), rpart(), or ctree() in party
should fit the bill.
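
A minimal sketch of the tree approach with rpart(), on simulated data
rather than Ralf's. The step levels, spike positions, and control values
below are all assumptions chosen for illustration. Setting minbucket = 3
keeps a spike of one or two points from forming a leaf of its own. Spikes
still pull the leaf mean, so the sketch also takes the per-leaf median,
which is an addition here and not part of the suggestion above.

```r
library(rpart)

set.seed(1)
# Simulated stepwise data (assumed): four steps of unequal length
level <- rep(c(3, 5, 1, 2), times = c(12, 17, 11, 17))
d <- data.frame(x = seq_along(level),
                y = level + rnorm(length(level), sd = 0.05))

# Add a single-point spike and a two-point spike (positions assumed)
spikes <- c(6, 20, 21)
d$y[spikes] <- d$y[spikes] + 2

# Regression tree on x alone gives a piecewise constant fit.
# minbucket = 3: no leaf may hold fewer than 3 points.
fit <- rpart(y ~ x, data = d,
             control = rpart.control(minsplit = 6, minbucket = 3, cp = 0.01))

step_mean   <- predict(fit)                        # leaf means
step_median <- ave(d$y, fit$where, FUN = median)   # leaf medians (more robust)

plot(d$x, d$y, xlab = "x", ylab = "y")
lines(d$x, step_mean, type = "s", lty = 2)
lines(d$x, step_median, type = "s")
```

The cp and minbucket values would need tuning on real data, especially
when some steps have fewer than 5 points.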

Another possibility is wavelets with the Haar basis.
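
A rough sketch of the Haar idea in base R, written out by hand so it does
not depend on any wavelet package. The helper names haar_dwt() and
haar_idwt(), the simulated data, and the choice of a hard universal
threshold are assumptions for illustration. The series length must be a
power of 2 here. Note that thresholding removes small noise and keeps
large jumps, so an isolated spike may well survive it.

```r
# Forward Haar transform: repeated pairwise averages and differences
haar_dwt <- function(v) {
  details <- list()
  while (length(v) > 1) {
    odd  <- v[c(TRUE, FALSE)]
    even <- v[c(FALSE, TRUE)]
    details[[length(details) + 1]] <- (odd - even) / sqrt(2)
    v <- (odd + even) / sqrt(2)
  }
  list(smooth = v, details = details)   # details[[1]] is the finest level
}

# Inverse Haar transform: undo the levels from coarsest to finest
haar_idwt <- function(w) {
  v <- w$smooth
  for (d in rev(w$details)) {
    out <- numeric(2 * length(v))
    out[c(TRUE, FALSE)] <- (v + d) / sqrt(2)
    out[c(FALSE, TRUE)] <- (v - d) / sqrt(2)
    v <- out
  }
  v
}

set.seed(1)
# Simulated stepwise data of length 64 = 2^6 (assumed)
level <- rep(c(3, 5, 1, 2), times = c(13, 19, 12, 20))
y <- level + rnorm(length(level), sd = 0.05)

w <- haar_dwt(y)

# Noise scale from the finest detail coefficients, universal threshold
sigma <- mad(w$details[[1]])
thr <- sigma * sqrt(2 * log(length(y)))

# Hard thresholding: zero out detail coefficients below the threshold
w$details <- lapply(w$details, function(d) ifelse(abs(d) > thr, d, 0))
y_hat <- haar_idwt(w)
```

Because every Haar basis function is itself piecewise constant, y_hat is
a step function by construction.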

(These will all preserve the piecewise constant nature of the problem,
while general smoothing procedures such as local regression and splines
assume there are no jumps in the underlying smooth function.)
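
To illustrate the point about jumps, here is a small sketch with
lowess() on a noise-free step. The data, the span f = 0.3, and iter = 0
(which switches off the robustness iterations so that this is plain local
regression) are assumptions chosen for the illustration.

```r
# A clean step from 0 to 1 between x = 50 and x = 51 (assumed data)
x <- 1:100
y <- ifelse(x <= 50, 0, 1)

# Plain local linear regression, no robustness iterations
fit <- lowess(x, y, f = 0.3, iter = 0)

# Fitted values next to the jump fall between the two levels,
# i.e. the jump is smeared into a ramp
fit$y[48:53]

plot(x, y)
lines(fit)
```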

Andy 

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Ralf B
> Sent: Tuesday, May 11, 2010 3:17 AM
> To: r-help at r-project.org
> Subject: [R] Smoothing Techniques - short stepwise functions 
> with spikes
> 
> R Friends,
> 
> I have data from which I would like to learn a more general
> (smoothed) trend by applying data smoothing methods. Data 
> points follow a positive stepwise function.
> 
> 
> |                                    x
>                      x
> |                      xxxxxxxx xxxxxxxx
> |       x    x
> |xxxx xxx xxxx
> |                                                   xxxxxxxxxxxxxxxxx
> |
> |
>           xxxxxxx xxxx
> |__________________________________________________________
> 
> 
> Data points from each step should not interact with any 
> other step. The outliers I want to remove are spikes as 
> shown in the diagram. These spikes do not have more than one 
> or two points. I consider larger groups as relevant and want 
> to keep them in. I sometimes have fewer than 5 points for each 
> step, and up to 50 at most.
> Given these conditions, would you suggest using one of the 
> moving averages (e.g. SMA, EMA, DEMA, ...) or the locally 
> linear regression (lowess) method? Are there any other options? 
> Does anybody know a good site that gives an overview of all 
> methods without going too much into mathematical details, 
> focusing instead on the requirements and underlying assumptions 
> of each method? Is there perhaps even a package that runs and 
> visualizes a comparison on the data, similar to packages like 
> 'party'? (With 1000s of active packages, one can always hope 
> for that.)
> 
> Thanks in advance!
> Ralf
> 