[Rd] convolve: request for "usual" behaviour + some improvements + some fixes

Martin Maechler maechler at stat.math.ethz.ch
Sat Feb 3 15:06:34 CET 2007


Thank you, Herve,

>>>>> "Herve" == Herve Pages <hpages at fhcrc.org>
>>>>>     on Fri, 02 Feb 2007 21:30:04 -0800 writes:

    Herve> Last but not least: convolve2 can be made 100 times or 1000 times faster
    Herve> than convolve by choosing a power of 2 for the length of the fft-buffer
    Herve> (a length of 2^n is the best case for the fft, the worst case being when
    Herve> the length is a prime number):

    >> x <- 1:100003
    >> y <- 1:1
    >> system.time(cc <- convolve(x, y, type="o")) # uses buffer length of 100003
    Herve> user  system elapsed
    Herve> 76.428   0.016  76.445
    >> system.time(cc <- convolve2(x, y, type="o")) # uses buffer length of 2^17
    Herve> user  system elapsed
    Herve> 0.164   0.012   0.179

The typical approach here and definitely the idea of the
original author of convolve() - would be to use  nextn()  here
instead of  "next_power_of_2()".

convolve() is one of the very old R functions stemming from
Auckland already seen in the very first R tarball that's still
available: Dated from June 20, 1995, with most file dates from
Jun 16/17, i.e. really of even older date,  the file src/rrc/fft
(no 'library', noR extension yet) contains definitions for  'fft', 'nextn'
and 'convolve' where the latter was (note the ":=" for what now
would be assignment to the base package)

convolve := function (a, b, conj=F)
{
	na <- length(a)
	nb <- length(b)
	n <- max(na, nb)
	if (nb < n)
		b <- c(b, rep(0, n - nb))
	else if (na < n)
		a <- c(b, rep(0, n - na))
	da <- fft(a)
	db <- fft(b)
	if(conj) {
		a <- da$x * db$x + da$y * db$y
		b <- - da$x * db$y + db$x * da$y
	}
	else {
		a <- da$x * db$x - da$y * db$y
		b <- da$x * db$y + db$x * da$y
	}
	fft(a, b, inv=T)$x
}

and just for historical fun here's the help file in that
cute olde help file format:

TITLE(convolve @ Fast Convolution)
USAGE(
convolve(a, b, conj=false)
)
ARGUMENTS(
ARG(a,b @ the sequences to be convolved.)
ARG(conj @ logical, if true the transform of LANG(b) is conjugated
before back-transformation.)
)
DESCRIPTION(
LANG(convolve) uses the Fast Fourier Transform to compute the
convolution of the sequences given as its arguments.
PARA
Complex conjugation is useful when computing
autocovariances and autocorrelations by fast convolution.
)
REFERENCES(
Brillinger, D. R. (1981).
ITALIC(Time Series: Data Analysis and Theory), Second Edition.
San Francisco: Holden-Day.
)
SEEALSO(
LANG(LINK(fft)), LANG(LINK(nextn)).
)

Later I had added bits to the docu, convolve got the 'type'
argument, Brian also fixed code and amplified the description and provided
the alternative filter()  which had hencefor been recommended
instead of convolve for most applications.

For back-compatibility (and reverence to the very first R code
writers (!?)) we did not change its behavior but rather documented it
more precisely.

I haven't studied the details of all the possible padding
options for a long time, but I remember that 0-padding
(to length 'n' where 'n' is "highly composite") is only
approximately the same as a non-padded version -- since the
Fourier frequencies  2*pi*j/n depend on n.
As you notice, padding is often very recommendable but it's
strictly giving the solution to a different problem.
For that reason, ?convolve has mentioned  nextn() for 12 years
now, but not "urged" the padding

If we would change convolve() to behave more like your proposal
how many user-defined functions and scripts will give wrong
answers if they are not amended?  
Probably only a very small fraction of all existing code (since
convolve() is not in wide use), but that may still be 100's of
cases. So that would need a the new version with a new name
(such as "convolve2" or maybe slightly nicer "convolve.".

Is it worth introducing that and to start deprecating convolve() ?
IIRC, the last time we considered this (several years ago), we
had concluded "no", but maybe it's worth to get rid of this
infelicity rather than to document it in ever more details.

Martin



    Herve> Here is the modified 'convolve2':

    Herve> convolve2 <- function(x, y, type = c("circular", "open", "filter"))
    Herve> {
    Herve> type <- match.arg(type)
    Herve> nx <- length(x)
    Herve> ny <- length(y)
    Herve> if (type == "circular") {
    Herve> nz <- max(nx, ny)
    Herve> } else {
    Herve> nz0 <- nx + ny - 1
    Herve> nz <- 2^ceiling(log2(nz0))
    Herve> }
    Herve> if (nz > nx)
    Herve> x[(nx+1):nz] <- as.integer(0)
    Herve> if (nz > ny)
    Herve> y[(ny+1):nz] <- as.integer(0)
    Herve> fz <- fft(x) * fft(y)
    Herve> z <- fft(fz, inverse=TRUE) / nz
    Herve> if (type == "open") {
    Herve> z <- z[1:nz0]
    Herve> } else {
    Herve> if (type == "filter")
    Herve> z <- z[1:nx]
    Herve> }
    Herve> if (is.numeric(x) && is.numeric(y))
    Herve> z <- Re(z)
    Herve> if (is.integer(x) && is.integer(y))
    Herve> z <- as.integer(round(z))
    Herve> z
    Herve> }

    Herve> In fact, it should try to be smarter than that and not use the fft at all
    Herve> when one of the 2 input sequences is very short (less than 3 or 4) or
    Herve> e.g. when one is 10000 times shorter than the other one.

    Herve> Cheers,
    Herve> H.


    Herve> Herve Pages wrote:
    >> Hi again,
    >> 
    >> There are many problems with current 'convolve' function.
    >> The author of the man page seems to be aware that 'convolve' does _not_ the
    >> usual thing:
    >> 
    >> Note that the usual definition of convolution of two sequences 'x'
    >> and 'y' is given by 'convolve(x, rev(y), type = "o")'.
    >> 
    >> and indeed, it does not:
    >> 
    >> > x <- 1:3
    >> > y <- 3:1
    >> > convolve(x, y, type="o")
    >> [1]  1  4 10 12  9
    >> 
    >> The "usual" convolution would rather give:
    >> 
    >> > convolve(x, rev(y), type="o")
    >> [1]  3  8 14  8  3
    >> 
    >> Also the "usual" convolution is commutative:
    >> 
    >> > convolve(y, rev(x), type="o")
    >> [1]  3  8 14  8  3
    >> 
    >> but convolve is not:
    >> 
    >> > convolve(y, x, type="o")
    >> [1]  9 12 10  4  1
    >> 
    >> Of course I could write the following wrapper:
    >> 
    >> usual_convolve <- function(x, y, ...) convolve(x, rev(y))
    >> 
    >> to work around those issues but 'convolve' has other problems:
    >> 
    >> (1) The input sequences shouldn't need to have the same length when
    >> type = "circular" (the shortest can be right-padded with 0s up
    >> to the length of the longest).
    >> (2) If the input sequences are both integer vectors, then the result
    >> should be an integer vector too.
    >> (3) The "filter" feature seems to be broken (it's not even clear
    >> what it should do or why we need it though):
    >> > x <- 1:9
    >> > y <- 1
    >> > convolve(x, y, type="f")
    >> Error in convolve(x, y, type = "f") : subscript out of bounds
    >> > convolve(y, x, type="f")
    >> numeric(0)
    >> (4) If you look at the source code, you'll see that 'x' is first left-padded
    >> with '0's. The "usual" convolution doesn't do that: it always padd
    >> sequences on the _same_ side (generally on the right).
    >> (5) It's not clear why we need a 'conj' arg. All what it does is
    >> take the conjugate of fft(y) before it does the product with fft(x).
    >> But this has the "non-usual" effect of reverting the expected result:
    >> > round(convolve(as.integer(c(0,0,0,1)), 1:7, type="o"))
    >> [1] 0 0 0 7 6 5 4 3 2 1
    >> 
    >> Here below is my version of 'convolve' just in case. It does the "usual"
    >> convolution plus:
    >> - no need to have 'x' and 'y' of the same length when 'type' is "circular",
    >> - when 'x' and 'y' are integer vectors, the output is an integer vector,
    >> - no more 'conj' arg (not needed, only leads to confusion),
    >> - when type is "filter", the output sequence is the same as with
    >> type="open" but is truncated to the length of 'x' (the original signal)
    >> It can be seen has the result of 'x' filtered by 'y' (the filter).
    >> 
    >> convolve2 <- function(x, y, type = c("circular", "open", "filter"))
    >> {
    >> type <- match.arg(type)
    >> nx <- length(x)
    >> ny <- length(y)
    >> if (type == "circular")
    >> nz <- max(nx, ny)
    >> else
    >> nz <- nx + ny - 1
    >> if (nz > nx)
    >> x[(nx+1):nz] <- as.integer(0)
    >> if (nz > ny)
    >> y[(ny+1):nz] <- as.integer(0)
    >> fx <- fft(x)
    >> fy <- fft(y)
    >> fz <- fx * fy
    >> z <- fft(fz, inverse=TRUE) / nz
    >> if (is.numeric(x) && is.numeric(y))
    >> z <- Re(z)
    >> if (is.integer(x) && is.integer(y))
    >> z <- as.integer(round(z))
    >> if (type == "filter")
    >> z[1:nx]
    >> else
    >> z
    >> }
    >> 
    >> Cheers,
    >> H.
    >> 
    >> ______________________________________________
    >> R-devel at r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> 

    Herve> ______________________________________________
    Herve> R-devel at r-project.org mailing list
    Herve> https://stat.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list