[R] Approximate Entropy?
Stephan Kolassa
Stephan.Kolassa at gmx.de
Tue Dec 23 19:45:47 CET 2008
Ben,
thanks a lot for that! I have a (reasonably) good idea about what ApEn
should be, and I'll try to understand your translation of Kaplan's
Matlab code.
Best,
Stephan
Ben Bolker schrieb:
>
>
> Stephan Kolassa wrote:
>> Dear guRus,
>>
>> is there a package that calculates the Approximate Entropy (ApEn) of a
>> time series?
>>
>> RSiteSearch only gave me a similar question in 2004, which appears not
>> to have been answered:
>> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/28830.html
>>
>> RSeek.org didn't yield any results at all.
>>
>> Happy holidays (where appropriate),
>> Stephan
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
> I went ahead and translated D. Kaplan's MATLAB code for this purpose into R.
> Warning: it's barely tested, and I'm not sure I understand what it's doing
> -- you're on your own from here ...
> (it won't work if you simply cut and paste what's here, since the examples
> are above the definitions
> of the utility functions etc.)
>
>
> ## http://www.macalester.edu/~kaplan/hrv/doc/funs/apen.html
>
> ## Approximate Entropy
> ## Syntax
>
> ## entropy = apen( pre, post, r );
>
> ## Arguments
> ## pre An embedding of data.
> ## post The images of the data in the embedding.
> ## r The filter factor, which sets the length scale over which to compute
> the approximate entropy.
> ## Returned Values
> ## entropy The numerical value of the approximate entropy.
> ## Description
>
> ## The "approximate entropy" was introduced by Pincus to quantify the
> ## creation of information in a time series. A low value of the
> ## entropy indicates that the time series is deterministic; a high
> ## value indicates randomness.
>
> ## The "filter factor" r is an important parameter. In principle, with
> ## an infinite amount of data, it should approach zero. With finite
> ## amounts of data, or with measurement noise, it is not always clear
> ## what is the best value to choose. Past work on heart rate
> ## variability has suggested setting r to be 0.2 times the standard
> ## deviation of the data.
>
> ## Another important parameter is the "embedding dimension." Again,
> ## there is know precise means of knowing the best such dimension, but
> ## previous work has used a dimension of 2. The final parameter is the
> ## embedding lag, which is often set to 1, but perhaps more
> ## appropriately is set to be the smallest lag at which the
> ## autocorrelation function of the time series is close to zero.
>
> ## The apen function expects the data to be presented in a specific
> ## format. Working with a time series tseries, the following steps
> ## will compute the approximate entropy, with an embedding dimension
> ## of 2 and a lag of 1.
>
> ## edim = 2;
> ## lag = 1;
> ## edata = lagembed(tseries,edim,lag);
> ## [pre,post] = getimage(edata,lag);
> ## r = 0.2*std(tseries);
> ## apen(pre,post,r);
>
> edim <- 2
> lag <- 1
> edata <- lagembed(tseries,edim,lag)
> im <- getimage(edata,lag)
> r <- 0.2*sd(tseries)
> apen(im$pre,im$post,r)
>
> apenembed <- function(tseries,edim,lag=1,relr=0.2,r) {
> edata <- lagembed(tseries,edim,lag)
> im <- getimage(edata,lag)
> if (missing(r)) r <- relr*sd(tseries)
> apen(im$pre,im$post,r)
> }
>
> ## References
>
> ## * SM Pincus (1991) Proc. Natl. Acad. Sci. USA 88:2297-2301
> ## * D Kaplan, MI Furman, SM Pincus, SM Ryan, LA Lipsitz, AL Goldberger
> (1991) "Aging and the complexity of cardiovascular dynamics," Biophys.J.
> 59:945-949
>
> ## See Also
> ## apenhr. lagembed. getimage.
> ## Examples
>
>
> lagembed <- function(ts,d,lag) {
> z <- embed(ts,d)
> z[seq(1,nrow(z),by=lag),]
> }
>
> getimage <- function(x,pred) {
> pre <- x[1:(nrow(x)-pred),]
> post <- x[(pred+1):nrow(x),1]
> list(pre=pre,post=post)
> }
>
> ## tseries = randn(500,1);
> ## should have a large approximate entropy.
>
> set.seed(1001)
> apenembed(rnorm(500),edim=3)
> ## 0.3240493
>
> apenembed(sin(seq(1,100,by=0.2)),edim=3)
>
> ## should have a small approximate entropy since it is deterministic.
> ## 0.1001811
>
> ## examples from utils web page:
> ts <- 1:5
> x <- lagembed(ts,3,1)
> im <- getimage(x,1)
>
> apen <- function(pre,post,r) {
>
> ## translated from Kaplan D (1998) HRV software.
> ## http://www.macalester.edu/~kaplan/hrv/doc/Feb3snap.tar
> ## computer approximate entropy a la Steve Pincus
>
> N <- nrow(pre)
> p <- ncol(pre)
>
> ## number of pairs of points closer than r in pre/post space
> phiM <- phiMplus1 <- 0
> for (k in 1:N) {
> ## replicate the current point
> foo <- matrix(rep(pre[k,],N),byrow=TRUE,nrow=N)
> ## calculate distance (max norm)
> goo <- abs(foo-pre)<=r
> ## which ones of them are closer than r using the max norm?
> closerpre <- if (p==1) goo else apply(goo,1,all)
> precount <- sum(closerpre)
> phiM <- phiM+log(precount)
>
> ## of the ones that were closer in the pre space, how many are closer
> ## in post also ?
> postcount <- sum(abs(post[closerpre] - post[k])<r)
> phiMplus1 <- phiMplus1 + log(postcount)
> ## cat(k,precount,phiM,postcount,phiMplus1,"\n")
> }
> (phiM-phiMplus1)/N
> }
>
> ## also see:
> ## http://www.cbi.dongnocchi.it/glossary/ApEn.html
> ## http://www.physionet.org/physiotools/ApEn/
>
> ts <- rep(61:65,10)
> apenembed(ts,d=5,r=2) ## get 0 instead of 0.00189?
>
>
More information about the R-help
mailing list