[R] strange fluctuations in system.time with kernapply

Ravi Varadhan rvaradhan at jhmi.edu
Mon May 2 15:41:28 CEST 2011


Why not use zero padding to improve the efficiency, i.e. append zeros to the end of the data vector so that the length of the resulting vector is a power of 2?  This is very common in signal processing, and it is legitimate since zero padding does not add any new information.
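A minimal sketch of this idea in R, assuming the default non-circular behaviour of kernapply(). The helper pad_kernapply is hypothetical and not part of the stats package. It pads to the next power of 2 and keeps only the first n - 2*m smoothed values, which are the ones computed entirely from the original data:

```r
# Hypothetical helper (not part of stats): zero-pad x to a power-of-2 length
# before calling kernapply(), then drop the values touched by the padding.
pad_kernapply <- function(x, k) {
  n <- length(x)
  m <- k$m                               # kernel half-width
  n_pad <- nextn(n, factors = 2)         # smallest power of 2 >= n
  x_pad <- c(x, numeric(n_pad - n))      # append zeros
  y <- kernapply(x_pad, k)               # non-circular: length n_pad - 2 * m
  y[seq_len(n - 2 * m)]                  # values that use only original data
}

tkern <- kernel("modified.daniell", c(5, 5))
set.seed(1)
x <- rnorm(1000)

# Compare against the unpadded result (equal up to floating-point error).
all.equal(pad_kernapply(x, tkern), kernapply(x, tkern))
```

Padding to a power of 2 can nearly double the length in the worst case, so a less restrictive target length may be preferable for very long vectors.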

Ravi.

-------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvaradhan at jhmi.edu

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Uwe Ligges
Sent: Monday, May 02, 2011 5:31 AM
To: Alexander Senger
Cc: r-help at r-project.org
Subject: Re: [R] strange fluctuations in system.time with kernapply



On 29.04.2011 23:38, Alexander Senger wrote:
> Hello expeRts,
>
>
> here is something which strikes me as kind of odd and I would like to
> ask for some enlightenment:
>
> First let's do this:
>
> tkern <- kernel("modified.daniell", c(5,5))
> test <- rep(1,1000000)
> system.time(kernapply(test,tkern))
> user system elapsed
> 1.100 0.040 1.136
>
> That was easy. Now this:
>
> test <- rep(1,1100000)
> system.time(kernapply(test,tkern))
> user system elapsed
> 1.40 0.02 1.43
>
> Still fine. Now this:
>
> test <- rep(1,1110000)
> system.time(kernapply(test,tkern))
> user system elapsed
> 1.390 0.020 1.409
>
> Ok, by now it seems boring. But wait:
>
> test <- rep(1,1110300)
> system.time(kernapply(test,tkern))
> user system elapsed
> 12.270 0.030 12.319
>
> There is a sudden - and repeatable! - jump in the time needed to execute
> kernapply. At least from a naive point of view there should not be much
> difference between applying a kernel to a vector 1110000 or 1110300
> entries long. But maybe there is some limit here?
>
> So I tried this:
>
> test <- rep(1,1110400)
> system.time(kernapply(test,tkern))
> user system elapsed
> 1.96 0.01 1.97
>
> which doesn't fit into the pattern. But the best thing is still to come.
> When I try this
>
> test <- rep(1,1110308)
> system.time(kernapply(test,tkern))
>
> then the computer keeps running for more than 15 minutes, at which
> point I normally kill the process. As noted above, this behaviour is
> repeatable and occurs every time I issue these commands.
>
> I would really like to know if there is some magic to the number 1110308
> that I'm not aware of.

The magic is that the length of the vector, 1110308, is inefficient for
the fft() used within kernapply(). fft() is fastest when the length is
highly composite, and integer powers of 2 are the classic case of a
really fast FFT.
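To see how a particular length factors, a small trial-division routine is enough at these sizes. This is only a sketch. prime_factors is a hypothetical helper, not a base R function:

```r
# Hypothetical helper: prime factors of a positive integer by trial division.
prime_factors <- function(n) {
  f <- numeric(0)
  d <- 2
  while (d * d <= n) {
    while (n %% d == 0) {   # divide out each factor as often as it occurs
      f <- c(f, d)
      n <- n / d
    }
    d <- d + 1
  }
  if (n > 1) f <- c(f, n)   # whatever remains is itself prime
  f
}

prime_factors(32768)    # fifteen 2s, since 32768 = 2^15
prime_factors(1110308)  # inspect the largest factor
```

A length whose largest prime factor is big is a candidate for slow transforms.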

You can also get long runtimes with smaller lengths, e.g. 100003.

As an example, compare:

system.time(fft(rep(1, 32768))) # roughly 0 seconds
system.time(fft(rep(1, 32771))) # almost 10 seconds
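For picking a friendlier length, the stats package provides nextn(), which returns the smallest integer not less than n that is a product of powers of the given factors (2, 3 and 5 by default). A short sketch, where the variable names are illustrative only:

```r
n <- 1110308

n_fast <- nextn(n)               # smallest length >= n with only factors 2, 3, 5
n_pow2 <- nextn(n, factors = 2)  # smallest power of 2 >= n
c(n = n, n_fast = n_fast, n_pow2 = n_pow2)

# Time the FFT of the series zero-padded to the highly composite length.
x <- rep(1, n)
system.time(fft(c(x, numeric(n_fast - n))))
```

The default factors usually give a much smaller padded length than a pure power of 2, at little cost in speed.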

Uwe Ligges



>
>
> Last but not least, here is my
>
> sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
> [1] LC_CTYPE=de_DE.utf8 LC_NUMERIC=C
> [3] LC_TIME=de_DE.utf8 LC_COLLATE=de_DE.utf8
> [5] LC_MONETARY=C LC_MESSAGES=de_DE.utf8
> [7] LC_PAPER=de_DE.utf8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] tools_2.10.1
>
>
> Thank you,
>
> Alex
>

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


