[R] Seeking a more efficient way to find partition maxima

Mon Jan 7 19:30:37 CET 2008

Talbot,

Try this:

PartMax <- function(x, breaks)
{
   breaks <- c(breaks, length(x) + 1)
   sapply(seq(length(breaks) - 1),
          function(i) max(x[breaks[i]:(breaks[i + 1] - 1)],
                          na.rm = TRUE))
}

 > PartMax(c(1,4,2,6,7,5),c(1,4,5))
[1] 4 6 7

 > PartMax(6:1,c(1,4,5))
[1] 6 3 2

I would not go to extremes in trying to avoid *apply functions. In many 
cases, it will provide sufficient performance and enable the code to be 
quite readable.  Short of coding a new R function in a lower level 
language like C, you are not likely to get more performance and then you 
have to balance the time it takes to improve the code, versus simply 
getting the job done reliably and moving on.

If you have very large initial vectors, with a large number of 
sub-partitions and will be doing this often, then it might make sense to 
spend the time to profile the code and fine tune it. But I would first 
get a sense of the actual running time on larger vectors before making 
that decision.

HTH,

Marc

Talbot Katz wrote:
> Hi Marc.
> 
> Thank you for the swift response! I should have explained more about
the partitions, I hoped it would be clear from the code. I am supplying
an index vector to create the partitions. In my original example of v =
c(1,4,2,6,7,5) with v1 = v[1:3], v2 = v[4], v3 = v[5:6], I would specify
this partition with the index vector c(1,4,5), giving the first indices
of each subvector; the implicit assumption is that the last index of
partition i is the first index of partition (i + 1) minus 1, except for
the last partition, which ends at the end of the original vector being
partitioned. Here are the results of this example from the two functions
I defined:
> 
>> partiCmax(c(1,4,2,6,7,5),c(1,4,5))
> [1] 4 6 7
>> partiMax(c(1,4,2,6,7,5),c(1,4,5))
> [1] 4 6 7
> 
> 
> However, if I use an example in which the maxima are not
> non-decreasing:
> 
>> partiCmax(6:1,c(1,4,5))
> [1] 6 6 6
>> partiMax(6:1,c(1,4,5))
> [1] 6 3 2
> 
> partiMax gives the sought result, but partiCmax doesn't. Your
> function
>is like a cleaner version of partiMax, using the fact that you have the
>partition in a list. This begs the question of what is the most
>efficient way to create the partition list from the original vector and
>the index vector that specifies the partition. But I am also wondering
>whether we can find something even more efficient than the use of the
>apply family of functions. Thanks!

<snip>