[R] Clustering

David Winsemius dwinsemius at comcast.net
Thu Oct 28 21:25:34 CEST 2010


On Oct 28, 2010, at 8:00 AM, dpender wrote:

>
> I am looking to use R in order to determine the number of extreme
> events for a high frequency (20 minutes) dataset of wave heights that
> spans 25 years (657,432 data points).
>
> I require the number, spacing and duration of the extreme events as an
> output.

If you create a logical "test" vector and then run rle() on it, you
may get what you want.

This yields the lengths of the runs between "events" (values greater
than 0.9), counted in observations:

 > wave <- runif(100)
 > test <- wave > 0.9
 > rle(test)
Run Length Encoding
   lengths: int [1:11] 74 1 5 1 1 1 6 1 4 1 ...
   values : logi [1:11] FALSE TRUE FALSE TRUE FALSE TRUE ...
 > rle(test)$lengths[ !rle(test)$values ]
[1] 74  5  1  6  4  5

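You can also get the "duration" of each extreme event by dropping the
negation and indexing with the values themselves:
rle(test)$lengths[ rle(test)$values ]. The lengths are counts of
consecutive observations, so for your data multiply by 20 minutes.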
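
To get the number, spacing and duration in one pass, something along
these lines might work. It is only a sketch: summarise_events() is a
made-up helper name (not from any package), dt_min is an assumed
argument for the sampling interval in minutes, and it assumes the
series has no missing values, since rle() treats each NA as its own run.

```r
# Hypothetical helper: summarise runs of values above a threshold.
# Assumes x is numeric with no NAs and a fixed sampling interval dt_min.
summarise_events <- function(x, threshold, dt_min = 20) {
  r <- rle(x > threshold)            # run-length encoding, computed once
  ends <- cumsum(r$lengths)          # index of the last sample of each run
  starts <- ends - r$lengths + 1L    # index of the first sample of each run
  is_event <- r$values               # TRUE for runs above the threshold

  # Only calm runs with an event on both sides count as spacing,
  # so the first and last runs are excluded.
  idx <- seq_along(r$lengths)
  interior <- idx > 1L & idx < length(idx)

  list(
    n_events     = sum(is_event),
    start_index  = starts[is_event],
    duration_min = r$lengths[is_event] * dt_min,
    gap_min      = r$lengths[!is_event & interior] * dt_min
  )
}

# Small made-up series: two events, at samples 2-3 and at sample 6.
wave <- c(0.1, 0.95, 0.97, 0.2, 0.3, 0.99, 0.1)
res <- summarise_events(wave, threshold = 0.9)
res$n_events      # 2
res$duration_min  # 40 20
res$gap_min       # 40
```

Because rle() is called once and everything else is vectorised, this
should cope with a few hundred thousand points without trouble.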
-- 
David.

>
> I have briefly used the clusters function in evd package.
>
> Can anyone suggest a more appropriate package to use for such a large
> dataset?
>

David Winsemius, MD
West Hartford, CT


