[Rd] Request: Suggestions for "good teaching" packages, esp. with C code

Paul Johnson pauljohn32 at gmail.com
Tue Feb 15 19:04:42 CET 2011


I am looking for CRAN packages that don't teach bad habits.  Can I
have suggestions?

I don't mean the recommended packages that come with R; I mean the
contributed ones.  I've been sampling a lot of examples and am
surprised that many ignore seemingly agreed-upon principles of R
coding.  On r-devel, almost everyone seems to support the "functional
programming" theme in Chambers's book "Software for Data Analysis",
but when I look at randomly selected packages, programmers don't
follow that advice.

In particular:

1. Functions must avoid "mystery variables from nowhere."

When reading a function's code, it should not be necessary to ask
"what's variable X?" and go hunting through the commands that lead up
to the function call.  If X is used in the function, it should be a
named argument, or extracted from one of the named arguments.  People
who rely on variables floating around in the user's environment are
creating hard-to-find bugs.

2. Functions should almost never have indirect effects (no <<- ).

3. Code should be vectorized where possible; C-style for loops over
vector members should be avoided.

4. We don't want gratuitous use of "return" at the end of functions.
Why do people still do that?

5. Neatness counts.  Code should look nice!  Check out how beautiful
the functions in MASS look!  I want code with spaces and " <- " rather
than everything jammed together with "=".
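
To make points 1 through 4 concrete, here is a minimal sketch of what I
have in mind.  The function names (center_scale, deposit,
squared_error_loop, squared_error) are hypothetical illustrations that I
made up for this message; they are not taken from any CRAN package.

```r
# Point 1: every input arrives as a named argument; nothing is looked
# up in the caller's workspace.
center_scale <- function(x, center = mean(x), scale = sd(x)) {
    (x - center) / scale
}

# Point 2: return the updated value and let the caller reassign it,
# instead of modifying a variable outside the function with <<-.
deposit <- function(balance, amount) {
    balance + amount
}

# Point 3, the habit to avoid: a C-style loop over vector members.
squared_error_loop <- function(y, yhat) {
    total <- 0
    for (i in seq_along(y)) {
        total <- total + (y[i] - yhat[i])^2
    }
    total
}

# Point 3, vectorized: the same sum of squared errors in one expression.
# Point 4: the value of the last expression is returned, so no explicit
# return() is needed.
squared_error <- function(y, yhat) {
    sum((y - yhat)^2)
}
```

With this style the caller writes balance <- deposit(balance, 25), so
the change is visible at the call site, and squared_error(y, yhat) gives
the same answer as the loop version.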

I don't mean to criticize any particular person's code in raising this
point.  For teaching examples, where should I focus?

Here's one candidate I've found:

MNP.  As far as I can tell, it meets the first 4 requirements.  And it
has some very clear C code with it as well.  I'm only hesitant there
because I'm not entirely sure that a package's C code should introduce
its own functions for handling vectors and matrices, when some general
purpose library might be more desirable.  But that's a small point,
and clarity and completeness count a great deal in my opinion.

Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
