[R] finding peaks in a simple dataset with R

Thu Nov 24 13:31:31 CET 2005

Hi Marc

I use this function for finding maxima in some spectral 
data (e.g. from X-ray diffraction) and it has satisfied my 
needs. I probably modified the function at some point for 
reasons related to plotting my data, so that it drops values 
from the end rather than from both sides.

The peaks in such data are quite different from occasional 
noise spikes, which is why I did not notice this bug.
Thanks for your suggestion.

Best regards.

Petr

On 23 Nov 2005 at 14:33, Marc Kirchner wrote:

Date sent:      	Wed, 23 Nov 2005 14:33:28 +0000
From:           	Marc Kirchner <marc.kirchner at iwr.uni-heidelberg.de>
To:             	Martin Maechler <maechler at stat.math.ethz.ch>
Copies to:      	R-help at r-project.org
Subject:        	Re: [R] finding peaks in a simple dataset with R

> > 
> > I wonder if we shouldn't polish that a bit and add to R's
> > standard 'utils' package.
> > 
> 
> Hm, I figured out there are (at least) two versions out there, one
> being the "original" idea and the other a modification. 
> 
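> === Petr Pikal in 2001 (based on Brian Ripley's idea) ===
> peaks <- function(series, span = 3) {
>  # each row of z is one window of 'span' consecutive values
>  z <- embed(series, span)
>  # TRUE where the window maximum sits in the centre column
>  result <- max.col(z) == 1 + span %/% 2
>  result
> }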
> 
> versus
> 
> === Petr Pikal in 2004 ===
> peaks2 <- function(series, span = 3) {
>  z <- embed(series, span)
>  s <- span %/% 2
>  v <- max.col(z) == 1 + s
>  # pad s FALSEs at the front, then drop s values from the end
>  result <- c(rep(FALSE, s), v)
>  result <- result[1:(length(result) - s)]
>  result
> }
> 
> Comparison shows
> > peaks(c(1,4,1,1,6,1,5,1,1),3)
> [1]  TRUE FALSE FALSE  TRUE FALSE  TRUE FALSE
> which is a logical vector for elements 2:(N-1), and
> 
> > peaks2(c(1,4,1,1,6,1,5,1,1),3)
> [1] FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE
> which is a logical vector for elements 1:(N-2).
> 
> As I would expect to "lose" (span-1)/2 elements on each side 
> of the vector, to me the 2001 version feels more natural.
> 
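One way to get the "lose span %/% 2 elements on each side"
behaviour while keeping the result aligned with the input is
to pad with FALSE at both ends. This is only a minimal sketch:
the name peaks_padded is made up for illustration and is not
one of the posted versions, and it assumes an odd span.

```r
# Hypothetical helper (not one of the posted versions): same test as
# peaks(), but padded so that result[i] refers to series[i].
# Assumes an odd span.
peaks_padded <- function(series, span = 3) {
  stopifnot(span %% 2 == 1, length(series) >= span)
  s <- span %/% 2
  z <- embed(series, span)
  # ties.method = "first" is used here only to keep the example reproducible
  v <- max.col(z, ties.method = "first") == 1 + s
  c(rep(FALSE, s), v, rep(FALSE, s))
}

x <- c(1, 4, 1, 1, 6, 1, 5, 1, 1)
idx <- which(peaks_padded(x, 3))
idx   # 2 5 7, the positions of the values 4, 6 and 5 in x
```

With the padding, which() gives positions in the original
series directly, without having to remember the offset.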
> Also, both "suffer" from being non-deterministic when a window
> contains multiple maxima (the two 4s here), because max.col()
> breaks ties at random by default:
> 
> > peaks(c(1,4,4,1,6,1,5,1,1),3)
> [1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
> > peaks(c(1,4,4,1,6,1,5,1,1),3)
> [1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE
> > peaks(c(1,4,4,1,6,1,5,1,1),3)
> [1] FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE
> > peaks(c(1,4,4,1,6,1,5,1,1),3)
> [1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
> 
> which also persists for span > 3 (without the 6 then, of course):
> 
> > peaks(c(1,4,4,1,1,1,5,1,1),5)
> [1]  TRUE FALSE FALSE FALSE  TRUE
> > peaks(c(1,4,4,1,1,1,5,1,1),5)
> [1] FALSE FALSE FALSE FALSE  TRUE
> > peaks(c(1,4,4,1,1,1,5,1,1),5)
> [1]  TRUE FALSE FALSE FALSE  TRUE
> 
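To see where the ties come from, it helps to look at the
matrix that embed() builds. Each row is one window, with the
latest value in column 1, and the rows containing both 4s
have two equal maxima. A small illustration (the variable
names are only for this example):

```r
x <- c(1, 4, 4, 1, 6, 1, 5, 1, 1)
z <- embed(x, 3)
z[1:2, ]
#      [,1] [,2] [,3]
# [1,]    4    4    1
# [2,]    1    4    4

# Rows in which the maximum occurs more than once, i.e. the rows
# where max.col() has to break a tie
tied <- which(rowSums(z == apply(z, 1, max)) > 1)
tied   # 1 2
```

Only those two rows are affected, which matches the first two
elements of the results flipping between runs.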
> This could (should?) be fixed by modifying the call to max.col():
>  result <- max.col(z, ties.method = "first") == 1 + span %/% 2
> 
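Put together, the 2001 version with that change might look
like the sketch below. The name peaks_first is made up here;
only the ties.method argument differs from the original. With
"first", a tie goes to the lowest column of the embed() matrix,
which holds the latest value in the window, so on a flat top
only the last of the equal values is reported.

```r
# Hypothetical name; same as the 2001 peaks() apart from ties.method
peaks_first <- function(series, span = 3) {
  z <- embed(series, span)
  max.col(z, ties.method = "first") == 1 + span %/% 2
}

x <- c(1, 4, 4, 1, 6, 1, 5, 1, 1)
r1 <- peaks_first(x, 3)
r2 <- peaks_first(x, 3)
identical(r1, r2)   # TRUE: repeated calls now agree
r1
# [1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
```

The result still covers elements 2:(N-1), so the first TRUE
refers to the second of the two 4s.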
> Just my two cents,
> Marc
> 
> -- 
> ========================================================
> Dipl. Inform. Med. Marc Kirchner
> Interdisciplinary Centre for Scientific Computing (IWR)
> Multidimensional Image Processing
> INF 368
> University of Heidelberg
> D-69120 Heidelberg
> Tel: ++49-6221-54 87 97
> Fax: ++49-6221-54 88 50
> marc.kirchner at iwr.uni-heidelberg.de
> 
> 

Petr Pikal
petr.pikal at precheza.cz