[Rd] stats::fft produces inconsistent results

Wed Oct 20 10:10:00 CEST 2021

Hello,

That sounds like a good diagnosis!
Indeed, R vectors are passed "by reference" to C code, but the semantic must be "by value", i.e. the C function must NOT change the contents of the vector, except in very specific cases.

A good program that has to work on a vector, must first duplicate the vector, unless the only reference to the vector is the reference inside the C function.
This can be tested by the MAYBE_REFERENCED() macro in Rinternals.h.

A good example can be found in the fft() function in src/library/stats/src/fourier.c in R source code:
    switch (TYPEOF(z)) {
    case INTSXP:
    case LGLSXP:
    case REALSXP:
        z = coerceVector(z, CPLXSXP);
        break;
    case CPLXSXP:
        if (MAYBE_REFERENCED(z)) z = duplicate(z);
        break;
    default:
        error(_("non-numeric argument"));
    }
    PROTECT(z);

This code coerces non-complex vectors to complex. Since this makes a copy, there is no need to duplicate.
Complex vectors are duplicated, unless they are not referenced by anything but the fft() function.

Now, the z vector can be modified "in place" without inconsistency.

Properly using R vectors in C code is tricky. You have to understand.
1) When you are allowed or not to modify vectors
2) When to PROTECT()vectors
3) How the garbage collector works and when it can trigger (answer : basically, when you call any internal R function)

Chapter 5 of "Writing R Extensions" documentation is quite extensive:
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Handling-R-objects-in-C

-- 
Sincerely
André GILLIBERT

-----Message d'origine-----
De : R-devel <r-devel-bounces using r-project.org> De la part de Ben Bolker
Envoyé : mercredi 20 octobre 2021 03:27
À : r-devel using r-project.org
Objet : Re: [Rd] stats::fft produces inconsistent results

   This is a long shot, but here's a plausible scenario:

   as part of its pipeline, ravetools::mvfftw computes the mean of the
input vector **and then centers it to a mean of zero** (intentionally or
accidentally?)

   because variables are passed to compiled code by reference (someone
can feel free to correct my terminology), this means that the original
vector in R now has a mean of zero

   the first element of fft() is mean(x)*length(x), so if mean(x) has
been forced to zero, that would explain your issue.

   I don't know about the non-reproducibility part.

On 10/19/21 7:06 PM, Dipterix Wang wrote:
> Dear R-devel Team,
>
> I'm developing a neuroscience signal pipeline package in R (https://github.com/dipterix/ravetools) and I noticed a weird issue that failed my unit test.
>
> Basically I was trying to use `fftw3` library to implement fast multivariate fft function in C++. When I tried to compare my results with stats::fft, the test result showed the first element of **expected** (which was produced by stats::fft) was zero, which, I am pretty sure, is wrong, and I can confirm that my function produces correct results.
>
> However, somehow I couldn’t reproduce this issue on my personal computer (osx, M1, R4.1.1), the error simply went away.
>
> The catch is my function produced consistent and correct results but stats::fft was not. This does not mean `stats::fft` has bugs. Instead, I suspect there could be some weird interactions between my code and stats::fft at C/C++ level, but I couldn’t figure it out why.
>
> +++ Details:
>
> Here’s the code I used for the test:
>
> https://github.com/dipterix/ravetools/blob/4dc35d64763304aff869d92dddad38a7f2b30637/tests/testthat/test-fftw.R#L33-L41
>
> ————————Test code————————
> set.seed(1)
> x <- rnorm(1000)
> dim(x) <- c(100,10)
> a <- ravetools:::mvfftw_r2c(x, 0)
> c <- apply(x, 2, stats::fft)[1:51,]
> expect_equal(a, c)
> ————————————————————————
>
> Here are the tests that gave me the errors:
>
> The test logs on win-builder
> https://win-builder.r-project.org/07586ios8AbL/00check.log
>
> Test logs on GitHub
> https://github.com/dipterix/ravetools/runs/3944874310?check_suite_focus=true
>
>
> —————————————— Failed tests ——————————————
>   -- Failure (test-fftw.R:41:3): mvfftw_r2c --------------------------------------
>   `a` (`actual`) not equal to `c` (`expected`).
>
>   actual vs expected
>                                       [,1]                    [,2]                  [,3]                  [,4]                    ...
>   - actual[1, ]     10.8887367+ 0.0000000i  -3.7808077+ 0.0000000i   2.967354+ 0.000000i   5.160186+ 0.000000i ...
>   + expected[1, ]    0.0000000+ 0.0000000i  -3.7808077+ 0.0000000i   2.967354+ 0.000000i   5.160186+ 0.000000i...
>
> ————————————————————————
>
> The first columns are different, `actual` is the results I produced via `ravetools:::mvfftw_r2c`, and `expected` was produced by `stats::fft`
>
>
> Any help or attention is very much appreciated.
> Thanks,
> - Zhengjia
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
Graduate chair, Mathematics & Statistics

______________________________________________
R-devel using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel