[Rd] convolve: request for "usual" behaviour + some improvements + some fixes

hpages at fhcrc.org
Mon Feb 5 02:51:22 CET 2007


Hi Martin,

Thanks for taking the time to read this. Here is a followup;
it's rather long but hopefully not too long (and not too boring) ;-)

Quoting Martin Maechler <maechler at stat.math.ethz.ch>:

> Thank you, Herve,
> 
> >>>>> "Herve" == Herve Pages <hpages at fhcrc.org>
> >>>>>     on Fri, 02 Feb 2007 21:30:04 -0800 writes:
> 
>     Herve> Last but not least: convolve2 can be made 100 times or 1000 times faster
>     Herve> than convolve by choosing a power of 2 for the length of the fft-buffer
>     Herve> (a length of 2^n is the best case for the fft, the worst case being when
>     Herve> the length is a prime number):
...
> The typical approach here and definitely the idea of the
> original author of convolve() - would be to use  nextn()  here
> instead of  "next_power_of_2()".

The current implementation of convolve uses an fft-buffer of length
nx + ny - 1 for "open" convolution, not nextn().
The fft-based convolution is by nature "circular". However, it can be
used for "open" convolutions: the trick is to pad the input sequences
with enough zeros to avoid the overlap inherent to circular convolution.
Then the output sequence doesn't "look" circular anymore.

For example, if the input sequences are
   X =   2   5   8
   Y = 100   1   0
then the "circular" convolution of X and Y is
   Z = 208 502 805
To achieve "open" convolution, it's enough to _right_ pad X and
Y with zeros:
   X =   2   5   8   0
   Y = 100   1   0   0
then the "circular" convolution doesn't look circular anymore (it
looks "open"):
   Z = 200 502 805   8

All you need to do is pad with enough zeros so that length(X) (and
length(Y)) >= nx + ny - 1. But you can add more zeros if you want: this
doesn't change the result (except that you get extra zeros in the output
sequence):
  X =   2   5   8   0   0   0
  Y = 100   1   0   0   0   0
  Z = 200 502 805   8   0   0
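This is easy to check directly with base R's fft() (a quick sketch;
note that fft(..., inverse=TRUE) is unnormalized, hence the division by n):

```r
# Circular convolution of two equal-length sequences via the fft.
circ_conv <- function(x, y) {
    n <- length(x)
    Re(fft(fft(x) * fft(y), inverse = TRUE)) / n
}

x <- c(2, 5, 8)
y <- c(100, 1, 0)
circ_conv(x, y)                  # wrapped: 208 502 805
circ_conv(c(x, 0), c(y, 0))      # open:    200 502 805 8
```

The only difference between the two calls is the extra zero, which
removes the wraparound of the last term onto the first.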

The current approach of adding the strict minimum number of necessary
zeros could _at first glance_ seem reasonable since it minimizes
the amount of memory used. But the problem with this approach is that the
fft algorithm is not very good in the general case and very, very bad when
the length of the sequence is a prime number: in that case it has to
perform n^2 complex multiplications, so it can literally take days or
weeks when n is a big prime number (>= 10 million).

> 
> convolve() is one of the very old R functions stemming from
> Auckland already seen in the very first R tarball that's still
> available: Dated from June 20, 1995, with most file dates from
> Jun 16/17, i.e. really of even older date,  the file src/rrc/fft
> (no 'library', noR extension yet) contains definitions for  'fft', 'nextn'
> and 'convolve' where the latter was (note the ":=" for what now
> would be assignment to the base package)
> 
> convolve := function (a, b, conj=F)
> {
> 	na <- length(a)
> 	nb <- length(b)
> 	n <- max(na, nb)
> 	if (nb < n)
> 		b <- c(b, rep(0, n - nb))
> 	else if (na < n)
> 		a <- c(b, rep(0, n - na))
> 	da <- fft(a)
> 	db <- fft(b)
> 	if(conj) {
> 		a <- da$x * db$x + da$y * db$y
> 		b <- - da$x * db$y + db$x * da$y
> 	}
> 	else {
> 		a <- da$x * db$x - da$y * db$y
> 		b <- da$x * db$y + db$x * da$y
> 	}
> 	fft(a, b, inv=T)$x
> }
> 
> and just for historical fun here's the help file in that
> cute olde help file format:
> 
> TITLE(convolve @ Fast Convolution)
> USAGE(
> convolve(a, b, conj=false)
> )
> ARGUMENTS(
> ARG(a,b @ the sequences to be convolved.)
> ARG(conj @ logical, if true the transform of LANG(b) is conjugated
> before back-transformation.)
> )
> DESCRIPTION(
> LANG(convolve) uses the Fast Fourier Transform to compute the
> convolution of the sequences given as its arguments.
> PARA
> Complex conjugation is useful when computing
> autocovariances and autocorrelations by fast convolution.
> )
> REFERENCES(
> Brillinger, D. R. (1981).
> ITALIC(Time Series: Data Analysis and Theory), Second Edition.
> San Francisco: Holden-Day.
> )
> SEEALSO(
> LANG(LINK(fft)), LANG(LINK(nextn)).
> )

If you ignore the huge bug that 'a' is replaced by 'b' during the 0-padding
operation, this one was better than the current 'convolve'. It already had
the 'conj' argument and the default for it was FALSE so, by default,
the convolve function was doing convolution, not cross-correlation
(more on this below).
But it was doing "circular" convolution only...

> 
> Later I had added bits to the docu, convolve got the 'type'
> argument, Brian also fixed code and amplified the description and provided
> the alternative filter()  which had hencefor been recommended
> instead of convolve for most applications.

filter() might be OK for users who want filtering but what about people
who want a convolution?

Just FYI, with:
   X =   2   5   8
   Y = 100   1
filter(X, Y) gives:
   Z = 502 805  NA
convolve(X, Y, type="filter") gives:
   Z = 205 508
and convolve(X, Y, conj=FALSE, type="filter") gives:
   Z = 200 502
so there are some inconsistencies.

But more important: what if I want the "usual" convolution product?

convolve(X, Y, type="o") gives:
   Z = 2 205 508 800
and convolve(X, Y, conj=FALSE, type="o") is closer to what I expect
   Z = 8 200 502 805
but still not there :-/

IMO, having a function that does the correct convolution product is
very useful, e.g. for polynomial multiplication (if you see X and Y as
the coefficients of polynomials Px and Py, then Z contains the coefficients
of the polynomial product Px * Py), or for fast high-precision arithmetic
(_exact_ multiplication of numbers with thousands of decimals).
An fft-based convolve function is _the_ tool for those operations
and of course, it should be fast.
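To sketch the polynomial use case ('poly_mult' is a hypothetical helper
name, not proposed API; it pads to a power-of-2 fft-buffer as advocated
above):

```r
# Multiply two polynomials given as coefficient vectors (lowest degree
# first): the coefficients of the product are the open convolution of
# the two input vectors, computed here with a power-of-2 fft-buffer.
poly_mult <- function(p, q) {
    nz <- length(p) + length(q) - 1
    n <- 2^ceiling(log2(nz))                  # next power of 2
    fp <- fft(c(p, rep(0, n - length(p))))
    fq <- fft(c(q, rep(0, n - length(q))))
    z <- Re(fft(fp * fq, inverse = TRUE)) / n
    round(z[1:nz])   # rounding kills fp noise (small integer coefs)
}

# (2 + 5t + 8t^2) * (100 + t) = 200 + 502t + 805t^2 + 8t^3
poly_mult(c(2, 5, 8), c(100, 1))   # 200 502 805 8
```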

> 
> For back-compatibility (and reverence to the very first R code
> writers (!?)) we did not change its behavior but rather documented it
> more precisely.
> 
> I haven't studied the details of all the possible padding
> options for a long time, but I remember that 0-padding
> (to length 'n' where 'n' is "highly composite") is only
> approximately the same as a non-padded version -- since the
> Fourier frequencies  2*pi*j/n depend on n.
> As you notice, padding is often very recommendable but it's
> strictly giving the solution to a different problem.

Nope: 0-padding gives exactly the same result (modulo floating-point
rounding errors).

> For that reason, ?convolve has mentioned  nextn() for 12 years
> now, but not "urged" the padding

convolve() mentions nextn() but does not use it. The user has
no control over the length of the fft-buffer that is used.
But they shouldn't need it: you can safely assume that the best
choice is almost always the next power-of-2 length.
Maybe there are a few situations where you don't want to do this.
For example, if length(X) + length(Y) - 1 is 1031, the next power
of 2 is 2048, so you need to (almost) double the size of X and Y.
Maybe you are on a machine with very limited memory and you don't
want to do this. But in that case you'd better not use the
fft-based convolution at all, because when n is a prime number (1031),
it's better to compute the convolution by just doing:

   Z[i] = sum(k, X[k]*Y[i-k])

You'll do far fewer multiplications! And they will be on
doubles (or integers) when X and Y are double (or integer) vectors
(with the fft-based convolution, all multiplications are on complex
numbers).
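For the record, that direct computation is just two nested loops
('conv_direct' is a hypothetical name used for illustration):

```r
# Direct open convolution: Z[i] = sum(k, X[k] * Y[i-k]).
# nx*ny real multiplications, no complex arithmetic, no padding.
conv_direct <- function(x, y) {
    z <- numeric(length(x) + length(y) - 1)
    for (i in seq_along(x))
        for (j in seq_along(y))
            z[i + j - 1] <- z[i + j - 1] + x[i] * y[j]
    z
}

conv_direct(c(2, 5, 8), c(100, 1))   # 200 502 805 8
```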

All this to say that it's only worth using an fft-based convolution
if the length of the fft-buffer is a power of 2. And for open
convolution you can _always_ arrange this. The cost is nothing
compared to the gain in speed (use twice the memory to multiply the
speed by 100, 10000 or more!)

For example, compare the speed of 'filter' (not fft-based) with the
speed of 'convolve2':

  > x <- sample(1:9, 100000, replace=TRUE)
  > y <- rep(1:1, 10000)
  > system.time(z1 <- filter(x, y))
     user  system elapsed
   31.486   0.460  32.478
  > system.time(z2 <- convolve2(x, y, type="o"))
     user  system elapsed
    0.400   0.032   0.454

Note that z1 starts and ends with 4999 NAs. All the values in between
(the filtered signal?) also appear in z2. To be precise:

  all.equal(z1[5000:95000], z2[10000:100000])

is TRUE.


> 
> If we would change convolve() to behave more like your proposal
> how many user-defined functions and scripts will give wrong
> answers if they are not amended?

With the current version of convolve, they are getting wrong answers
anyway. Nobody on this list wants this

  http://www.sciencemag.org/cgi/content/summary/314/5807/1856

to happen just because of a buggy convolve function in R, right?

> Probably only a very small fraction of all existing code (since
> convolve() is not in wide use), but that may still be 100's of
> cases. So that would need a the new version with a new name
> (such as "convolve2" or maybe slightly nicer "convolve.".
> 
> Is it worth introducing that and to start deprecating convolve() ?
> IIRC, the last time we considered this (several years ago), we
> had concluded "no", but maybe it's worth to get rid of this
> infelicity rather than to document it in ever more details.
> 

Here below is another 'convolve2' function (just in case). I've
reintroduced the 'conj' argument after I did some homework
and understood what it was really doing. _What_ it
does is make 'convolve' return the cross-correlation of
X and Y instead of the convolution.
_How_ it does this is by taking the conjugate of fft(y)
before multiplying it by fft(x). Without going into too many
details, I'll just say that this "trick" doesn't play well
with the type="o" option, hence the special treatment of
y when 'type' is not "circular" and 'conj' is TRUE.
Also, I've reset the default to FALSE, so by default convolve2
does convolution, not cross-correlation.

As for the "filter" type, it seems that the idea of its original
author was to truncate the result of the convolution, but it's
not clear to me where this truncation should occur (on the right?
on the left?). IMO, the convolution product of X by Y _is_ the
result of X filtered by Y; it's just that one terminology comes
from maths and the other from signal processing. And there is nothing
to truncate: if the filter (Y) has a length >= 2, then it's
normal to expect a filtered signal longer than X, so I would tend to
say that the "filter" feature is a broken concept.

Best,
H.


convolve2 <- function(x, y, conj=FALSE, type=c("circular", "open", "filter"))
{
    type <- match.arg(type)
    nx <- length(x)
    ny <- length(y)
    if (type == "circular") {
        fft_length <- max(nx, ny)
    } else {
        ## Cross-correlation is convolution with y reversed (and
        ## conjugated): doing it this way plays well with 0-padding.
        if (conj) {
            y <- rev(y)
            if (is.complex(y))
                y <- Conj(y)
            conj <- FALSE
        }
        nz <- nx + ny - 1
        ## Pad to the next power of 2: the best case for fft().
        fft_length <- 2^ceiling(log2(nz))
    }
    ## Right-pad both inputs with zeros up to the fft-buffer length.
    if (fft_length > nx)
        x[(nx+1):fft_length] <- as.integer(0)
    if (fft_length > ny)
        y[(ny+1):fft_length] <- as.integer(0)
    fy <- fft(y)
    if (conj)
        fy <- Conj(fy)
    ## fft(..., inverse=TRUE) is unnormalized, hence the division.
    z <- fft(fft(x) * fy, inverse=TRUE) / length(x)
    if (type == "open") {
        z <- z[1:nz]      # drop the extra zeros added by the padding
    } else {
        if (type == "filter")
            z <- z[1:nx]
    }
    if (is.numeric(x) && is.numeric(y))
        z <- Re(z)
    if (is.integer(x) && is.integer(y))
        z <- as.integer(round(z))
    z
}
