[Rd] NOTE when detecting mismatch in output, and codes for NOTEs, WARNINGs and ERRORs

Thu Apr 10 10:34:51 CEST 2014

On 03/26/2014 06:46 PM, Paul Gilbert wrote:
>
>
> On 03/26/2014 04:58 AM, Kirill Müller wrote:
>> Dear list
>>
>>
>> It is possible to store expected output for tests and examples. From the
>> manual: "If tests has a subdirectory Examples containing a file
>> pkg-Ex.Rout.save, this is compared to the output file for running the
>> examples when the latter are checked." And, earlier (written in the
>> context of test output, but apparently applies here as well): "...,
>> these two are compared, with differences being reported but not causing
>> an error."
>>
>> I think a NOTE would be appropriate here, in order to be able to detect
>> this by only looking at the summary. Is there a reason for not flagging
>> differences here?
>
> The problem is that differences occur too often because this is a 
> comparison of characters in the output files (a diff). Any output that
> depends on the locale, node name, Internet downloads, time, host, or
> OS is likely to cause a difference. Also, if you print results to a
> high precision you will get differences on different systems, 
> depending on OS, 32 vs 64 bit, numerical libraries, etc. A better test 
> strategy when it is numerical results that you want to compare is to 
> do a numerical comparison and throw an error if the result is not 
> good, something like
>
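>   r     <- 0.1 + 0.2  # result from your function (placeholder value)
>   rGood <- 0.3        # known good value
>   fuzz  <- 1e-12      # tolerance
>
>   # fail the test if any element deviates by more than the tolerance
>   if (fuzz < max(abs(r - rGood))) stop('Test xxx failed.')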
>
> It is more work to set up, but the maintenance will be less, 
> especially when you consider that your tests need to run on different 
> OSes on CRAN.
>
> You can also use try() and catch error codes if you want to check those.
>
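
For reference, a minimal sketch of the two suggestions quoted above, 
using only base R. The function safe_sqrt() is a hypothetical stand-in 
for the code under test; all.equal() is swapped in for the hand-written 
tolerance check and compares relative differences by default.

```r
# Hypothetical function under test: fails on negative input.
safe_sqrt <- function(x) {
  if (any(x < 0)) stop("negative input")
  sqrt(x)
}

# Numerical comparison with a tolerance instead of a character diff.
r <- safe_sqrt(2)^2
rGood <- 2
stopifnot(isTRUE(all.equal(r, rGood, tolerance = 1e-12)))

# try() returns an object of class "try-error" when the call fails,
# so a test can assert that an error is actually raised.
res <- try(safe_sqrt(-1), silent = TRUE)
if (!inherits(res, "try-error")) stop("Test failed: expected an error.")
```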

Thanks for your input.

To me, this is a different kind of test, for which I'd rather use the 
facilities provided by the testthat package. Imagine a function that 
operates on, say, strings, vectors, or data frames, and that is expected 
to produce completely identical results on all platforms -- here, a 
character-by-character comparison of the output is appropriate, and I'd 
rather see a WARNING or ERROR if something fails.
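
A minimal sketch of such an exact check, written in base R so that it 
runs without testthat. The function normalize_names() is hypothetical 
and only stands in for any function whose result should be identical on 
all platforms.

```r
# Hypothetical function under test: purely string-based, so its result
# is expected to be the same on every platform.
normalize_names <- function(x) {
  gsub("[^a-z0-9]+", "_", tolower(trimws(x)))
}

actual   <- normalize_names(c(" First Name", "E-Mail  Address "))
expected <- c("first_name", "e_mail_address")

# identical() compares exactly, with no tolerance; a mismatch is an error.
if (!identical(actual, expected)) {
  stop("normalize_names() output differs from the expected value")
}
```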

Perhaps this functionality can be provided by external packages like 
roxygen and testthat: roxygen could create the "good" output (if asked 
for) and set up a testthat test that compares the example run with the 
"good" output. This would duplicate part of the work already done by 
base R; the duplication could be avoided if there were a way to specify 
the severity of a character-level difference between output and expected 
output, perhaps by means of an .Rout.cfg file in DCF format:

OnDifference: mute|note|warning|error
Normalize: [R expression]
Fuzziness: [number of different lines that are tolerated]
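
To illustrate, a rough sketch of how such a file could be read and 
applied. Nothing here exists in base R beyond read.dcf(); the file 
format is only the proposal above, report_difference() is a 
hypothetical helper, the line-by-line comparison is a naive stand-in 
for a real diff, and the Normalize field is left out.

```r
# Write an example of the proposed (hypothetical) .Rout.cfg file.
cfg_file <- tempfile(fileext = ".Rout.cfg")
writeLines(c("OnDifference: warning", "Fuzziness: 1"), cfg_file)

# read.dcf() returns a character matrix with one column per field.
cfg <- read.dcf(cfg_file)
on_difference <- cfg[1, "OnDifference"]
fuzziness <- as.integer(cfg[1, "Fuzziness"])

# Hypothetical helper: count differing lines and react as configured.
report_difference <- function(actual, expected, on_difference, fuzziness) {
  n <- max(length(actual), length(expected))
  length(actual) <- n    # pad the shorter vector with NA
  length(expected) <- n
  n_diff <- sum(is.na(actual) | is.na(expected) | actual != expected)
  if (n_diff <= fuzziness) return(invisible(n_diff))
  msg <- sprintf("%d line(s) differ from the expected output", n_diff)
  switch(on_difference,
    mute = NULL,
    note = message("NOTE: ", msg),
    warning = warning(msg, call. = FALSE),
    error = stop(msg, call. = FALSE),
    stop("unknown OnDifference value: ", on_difference)
  )
  invisible(n_diff)
}
```

A call would then look like report_difference(readLines(out), 
readLines(saved), on_difference, fuzziness), where out and saved are 
the paths of the fresh and the stored output.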

On that note: Is there a convenient way to create the .Rout.save files 
in base R? By "convenient" I mean a single function call, not checking 
and manually copying as suggested here: 
https://stat.ethz.ch/pipermail/r-help/2004-November/060310.html .
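
What I have in mind is roughly the following sketch. save_rout() is a 
hypothetical helper, not an existing function; it assumes that wrapping 
R CMD BATCH is acceptable and that the output should be stored next to 
the script as <name>.Rout.save.

```r
# Hypothetical helper: run a script in a fresh R process and store its
# output next to it as <name>.Rout.save.
save_rout <- function(script) {
  out <- paste0(tools::file_path_sans_ext(script), ".Rout.save")
  status <- system2(
    file.path(R.home("bin"), "R"),
    c("CMD", "BATCH", "--vanilla", shQuote(script), shQuote(out))
  )
  if (status != 0) stop("R CMD BATCH failed for ", script)
  invisible(out)
}

# Usage (the path is an example only):
# save_rout("tests/mytest.R")   # writes tests/mytest.Rout.save
```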

Cheers

Kirill