[R] Unit Testing Frameworks: summary and brief discussion

Thu May 10 03:18:02 CEST 2007

Tony

Thanks for the summary.

My ad hoc system is pretty good for catching flagged errors, and 
numerical errors when I have a check.  Could you (or someone else) 
comment on how easy it would be with one of these more formal frameworks 
to do three things I have not been able to accomplish easily:

- My code gives error and warning messages in some situations. I want to 
test that the errors and warnings work, but these flags are the correct 
response to the test. In fact, it is an error if I don't get the flag. 
How easy is it to set up automatic tests to check warning and error 
messages work?

- For some things it is the printed format that matters. How easy is it 
to set up a test of the printed output? (Something like the Rout files 
used in R CMD check.) I think this is what Tony Plate is calling 
transcript file tests, and I guess it is not automatically available. I 
am not really interested in something I would have to change with each 
new release of R, and I need it to work cross-platform. I want to know 
when something has changed, in R or my own code, without having to 
examine the output carefully.

- (And now the hard one.) For some things it is the plotted output that 
matters. Is it possible to set up automatic tests of plotting? I can 
already test that plots run. I want to know if they "look very 
different". And no, I don't have a clue where to start on this one.

Paul Gilbert

anthony.rossini at novartis.com wrote:
> Greetings -
> 
> I'm finally finished review, here's what I heard:
> 
> ============ from Tobias Verbeke:
> 
> anthony.rossini at novartis.com wrote:
>> Greetings!
>>
>> After a quick look at current programming tools, especially with regards 
> 
>> to unit-testing frameworks, I've started looking at both "butler" and 
>> "RUnit".   I would be grateful to receieve real world development 
>> experience and opinions with either/both.    Please send to me directly 
>> (yes, this IS my work email), I will summarize (named or anonymous, as 
>> contributers desire) to the list.
>>
> I'm founding member of an R Competence Center at an international 
> consulting company delivering R services
> mainly to the financial and pharmaceutical industries. Unit testing is 
> central to our development methodology
> and we've been systematically using RUnit with great satisfaction, 
> mainly because of its simplicity. The
> presentation of test reports is basic, though. Experiences concerning 
> interaction with the RUnit developers
> are very positive: gentle and responsive people.
> 
> We've never used butler. I think it is not actively developed (even if 
> the developer is very active).
> 
> It should be said that many of our developers (including myself) have 
> backgrounds in statistics (more than in cs
> or software engineering) and are not always acquainted with the 
> functionality in other unit testing frameworks
> and the way they integrate in IDEs as is common in these other languages.
> 
> I'll soon be personally working with a JUnit guru and will take the 
> opportunity to benchmark RUnit/ESS/emacs against
> his toolkit (Eclipse with JUnit- and other plugins, working `in perfect 
> harmony' (his words)). Even if in my opinion the
> philosophy of test-driven development is much more important than the 
> tools used, it is useful to question them from
> time to time and your message reminded me of this... I'll keep you 
> posted if it interests you. Why not work out an
> evaluation grid / check list for unit testing frameworks ?
> 
> Totally unrelated to the former, it might be interesting to ask oneself 
> how ESS could be extended to ease unit testing:
> after refactoring a function some M-x ess-unit-test-function 
> automagically launches the unit-test for this particular
> function (based on the test function naming scheme), opens a *test 
> report* buffer etc.
> 
> Kind regards,
> Tobias
> 
> ============ from Tony Plate:
> 
> Hi, I've been looking at testing frameworks for R too, so I'm interested 
> to hear of your experiences & perspective.
> 
> Here's my own experiences & perspective:
> The requirements are:
> 
> (1) it should be very easy to construct and maintain tests
> (2) it should be easy to run tests, both automatically and manually
> (3) it should be simple to look at test results and know what went wrong 
> where
> 
> I've been using a homegrown testing framework for S-PLUS that is loosely 
> based on the R transcript style tests (run *.R and compare output with 
> *.Rout.save in 'tests' dir).  There are two differences between this 
> test framework and the standard R one:
> (1) the output to match and the input commands are generated from an 
> annotated transcript (annotations can switch some tests in or out 
> depending on the version used)
> (2) annotations can include text substitutions (regular expression 
> style) to be made on the output before attempting to match (this helps 
> make it easier to construct tests that will match across different 
> versions that might have minor cosmetic differences in how output is 
> formatted).
> 
> We use this test framework for both unit-style tests and system testing 
> (where multiple libraries interact and also call the database).
> One very nice aspect of this framework is that it is easy to construct 
> tests -- just cut and paste from a command window.  Many tests can be 
> generated very quickly this way (my impression is that is is much much 
> faster to build tests by cutting and pasting transcripts from a command 
> window than it is to build tests that use functions like all.equal() to 
> compare data structures.) It is also easy to maintain tests in the face 
> of change (e.g., with a new version of S-PLUS or with bug fixes to 
> functions or with changed database contents) -- I use ediff in emacs to 
> compare test output with the stored annotated transcript and can usually 
> just use ediff commands to update the transcript.
> 
> This has worked well for us and now we are looking at porting some code 
> to R.  I've not seen anything that offers these conveniences in R.
> 
> It wouldn't be too difficult to add these features to the built-in R 
> testing framework, but I've not had success in getting anyone in R core 
> to listen to even consider changes, so I've not pursued that route after 
> an initial offer of some simple patches to tests.mk and wintests.mk.
> 
> RUnit doesn't have transcript-style tests, but it wasn't very difficult 
> to add support for transcript-style tests to it.  I'll probably go ahead 
> and use some version of that for our porting project.  (And offer it to 
> the community if the RUnit maintainers want to incorporate it.)  I also 
> like the idea that RUnit has some code analysis tools -- that might 
> support some future project that allowed one to catalogue the number of 
> times each code path through a function was exercised by the tests.
> 
> I just looked at 'butler' and it looks very much like RUnit to me -- and 
> I didn't see any overview that explained differences.  Do you know of 
> any differences?
> 
> cheers,
> 
> Tony Plate
> 
> 
> ============== from Paul Gilbert:
> 
> Tony
> 
> While this is not exactly your question, I have been using my own system 
>   based on make and the tools use by R CMD build/check to do something I 
> think of as unit testing. This pre-dates the unit-testing frameworks, in 
> fact, some of it predates R. I actually wrote something on this at one 
> point: Paul Gilbert. R package maintenance. R News, 4(2):21-24, 
> September 2004.
> 
> I have occasionally thought about trying to use RUnit, but never done 
> much because I am relatively happy with what I have. (Inertia is an 
> issue too.)  I would be happy to hear if you do an assessment of the 
> various tools.
> 
> Best,
> Paul Gilbert
> 
> 
> ============= From Seth Falcon:
> 
> Hi Tony,
> 
> anthony.rossini at novartis.com writes:
>> After a quick look at current programming tools, especially with regards 
> 
>> to unit-testing frameworks, I've started looking at both "butler" and 
>> "RUnit".   I would be grateful to receieve real world development 
>> experience and opinions with either/both.    Please send to me directly 
>> (yes, this IS my work email), I will summarize (named or anonymous, as 
>> contributers desire) to the list.
> 
> I've been using RUnit and have been quite happy with it.  I had not
> heard of butler until I read your mail (!).
> 
> RUnit behaves reasonably similarly to other *Unit frameworks and this
> made it easy to get started with as I have used both JUnit and PyUnit
> (unittest module).
> 
> Two things to be wary of:
> 
>   1. At last check, you cannot create classes in unit test code and
>      this makes it difficult to test some types of functionality.  I'm
>      really not sure to what extent this is RUnit's fault as opposed
>      to limitation of the S4 implemenation in R.
> 
>   2. They have chosen a non-default RNG, but recent versions provide a
>      way to override this.  This provided for some difficult bug
>      hunting when unit tests behaved differently than hand-run code
>      even with set.seed().
> 
> The maintainer has been receptive to feedback and patches.  You can
> look at the not-so-beautiful scripts and such we are using if you look
> at inst/UnitTest in: Category, GOstats, Biobase, graph
> 
> Best Wishes,
> 
> + seth
> 
> 
> =================== Discussion:
> 
> After a bit more cursory use, it looks like RUnit is probably the right 
> approach at this time (sorry Hadley!).   Both RUnit and butler have a 
> range of testing facilities and programming support tools.   I support the 
> above statements about feasibility and problems -- except that I didn't 
> get a chance to checkout the S4 issues that Seth raised above.    The one 
> piece that I found missing in my version was some form of GUI based 
> tester, i.e. push a button and test, but I think I've not thought through 
> some of the issues with environments and closures yet that might cause 
> problems.
> 
> Thanks to everyone for responses!  It's clear that there is a good start 
> here, but lots of room for improvement exists.
> 
> Best regards / Mit freundlichen Grüssen, 
> Anthony (Tony) Rossini
> Novartis Pharma AG
> MODELING & SIMULATION
> Group Head a.i., EU Statistical Modeling
> CHBS, WSJ-027.1.012
> Novartis Pharma AG
> Lichtstrasse 35
> CH-4056 Basel
> Switzerland
> Phone: +41 61 324 4186
> Fax: +41 61 324 3039
> Cell: +41 79 367 4557
> Email : anthony.rossini at novartis.com
> 
> 	[[alternative HTML version deleted]]
> 
> 
> 
> ------------------------------------------------------------------------
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
====================================================================================

La version française suit le texte anglais.

------------------------------------------------------------------------------------

This email may contain privileged and/or confidential inform...{{dropped}}