[R] Suggestion about "R equivalent of Splus peaks() function"

Earl F. Glynn efg at stowers-institute.org
Thu Feb 8 23:34:31 CET 2007


There was this R-Help posting in Jan 2004:

    http://finzi.psych.upenn.edu/R/Rhelp02a/archive/33097.html
    R equivalent of Splus peaks() function?

The peaks function there has worked well for me on a couple of projects, but 
some code using "peaks" that had worked fine in the past failed today.

I was looking for a peak in a test case that was a sine curve over one 
cycle, so there should have been only one peak.  To my surprise I sometimes 
got one peak, sometimes two adjoining peaks (a tie), and sometimes no peak 
at all, and the no-peak case caused subsequent code to fail.  I wanted to 
eliminate this "no peak" case when there was an obvious peak.

I thought it was odd that the peak failure could be controlled by the random 
number seed.

# R equivalent of Splus peaks() function
# http://finzi.psych.upenn.edu/R/Rhelp02a/archive/33097.html

peaks <- function(series,span=3)
{
  z <- embed(series, span)
  s <- span%/%2
  v <- max.col(z) == 1 + s
  result <- c(rep(FALSE,s),v)
  result <- result[1:(length(result)-s)]
  result
}

> set.seed(19)
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE  TRUE  TRUE FALSE  TRUE FALSE  TRUE
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE


Above, the tied "4" peak at positions 2 and 3 of the series shows up as the 
TRUE/FALSE values in positions 2 and 3 of each result.  The fourth call, 
which gave FALSE, FALSE there, was most unexpected: no peak.


I studied the peaks code and found that the problem seems to be in max.col, 
which gives different results for repeated calls on the same matrix z:
> z
     [,1] [,2] [,3]
[1,]    4    4    1
[2,]    1    4    4
[3,]    6    1    4
[4,]    1    6    1
[5,]    5    1    6
[6,]    1    5    1
[7,]    1    1    5

> max.col(z)
[1] 2 3 1 2 3 2 3
> max.col(z)
[1] 2 2 1 2 3 2 3
> max.col(z)
[1] 1 2 1 2 3 2 3
> max.col(z)
[1] 2 2 1 2 3 2 3
> max.col(z)
[1] 1 3 1 2 3 2 3
> max.col(z)
[1] 2 2 1 2 3 2 3

The ?max.col help shows that it has a ties.method argument that defaults to 
"random".  I want a peak, any peak, if there is a tie, but I don't want a 
tie to be treated as "no peak".  For now, I passed "first" as the 
ties.method to max.col in peaks:

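# Break ties by using "first"

peaks <- function(series,span=3)
{
  # one row per window of length span (embed stores each window reversed)
  z <- embed(series, span)
  # offset of the window centre
  s <- span%/%2
  # TRUE where the centre column holds the row maximum; ties go to column 1
  v <- max.col(z, ties.method="first") == 1 + s
  # pad on the left so that result[i] lines up with series[i]
  result <- c(rep(FALSE,s),v)
  result <- result[1:(length(result)-s)]
  result
}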
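
The effect of the tie-breaking rule can be checked in isolation on the 
embedded matrix from the example above.  This is only a small sketch (the 
names x, first and last are mine, for illustration); with "first" or "last" 
the result should no longer depend on the random number seed:

```r
x <- c(1, 4, 4, 1, 6, 1, 5, 1, 1)
z <- embed(x, 3)   # row i holds x[i+2], x[i+1], x[i]

# Only rows 1 and 2 contain a tie (4, 4), so only they can differ.
first <- max.col(z, ties.method = "first")
last  <- max.col(z, ties.method = "last")

first   # 1 2 1 2 3 2 3
last    # 2 3 1 2 3 2 3
```

Column 2 is the window centre, so "first" reports the tie in row 2 and 
"last" reports it in row 1; either way exactly one of the two tied rows 
counts as a peak.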

A better solution might be a ties.method parameter to peaks, which can be 
passed to max.col.
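
A minimal sketch of that idea follows.  It is untested beyond the example 
series used above; the name peaks2 is hypothetical (chosen so it does not 
mask the version above), and the default of "first" is my assumption, picked 
only so that results are reproducible:

```r
# peaks2 and its default tie rule are assumptions for illustration only.
peaks2 <- function(series, span = 3,
                   ties.method = c("first", "last", "random"))
{
  ties.method <- match.arg(ties.method)   # validate the choice
  z <- embed(series, span)
  s <- span %/% 2
  # forward the tie rule unchanged to max.col
  v <- max.col(z, ties.method = ties.method) == 1 + s
  result <- c(rep(FALSE, s), v)
  result[1:(length(result) - s)]
}

peaks2(c(1,4,4,1,6,1,5,1,1), 3)           # tie resolved with "first"
peaks2(c(1,4,4,1,6,1,5,1,1), 3, "last")   # tie resolved with "last"
```

Because embed stores each window in reverse order, "first" marks the later 
of the two tied values (position 3) and "last" marks the earlier one 
(position 2).  Passing "random" gives back the original behavior, including 
the occasional "no peak" result.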

I did all of this in R 2.4.1, but the problem seems to be in earlier 
versions too.

Just in case anyone else is using this "peaks" function.

efg

Earl F. Glynn
Stowers Institute for Medical Research


