[Rd] CRAN policies

Paul Gilbert pgilbert902 at gmail.com
Fri Mar 30 05:39:02 CEST 2012


On 12-03-29 09:29 PM, Mark.Bravington at csiro.au wrote:
 > I'm concerned this thread is heading the wrong way, towards
 > techno-fixes for imaginary problems. R package-building is already
 > encumbered with a huge set of complicated rules, and more
 > instructions/rules eg for metadata would make things worse not better.
 >
 > RCMD CHECK on the 'mvbutils' package generates over 300 Notes about
 > "no visible binding...", which inevitably I just ignore. They arise
 > because RCMD CHECK is too "stupid" to understand one of my preferred
 > coding idioms (I'm not going to explain what-- that's beside the
 > point).

Actually, I think that is the point. If your code is generating that 
many notes then I think you should explain your idiom, so the checks can 
be made to accommodate it if it really is good. Otherwise, I'd be 
worried about the quality of your code.
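
For anyone who has not met these notes, here is a minimal sketch of one common way a "no visible binding" note can arise, using codetools directly. It is only an illustration: the function `ratio`, the helper `collect_notes`, and the column names `Num` and `Den` are made up, and this is not claimed to be the idiom used in 'mvbutils'.

```r
library(codetools)

# Hypothetical function: Num and Den are columns of `df`, but static
# analysis cannot know that, because with() evaluates its second
# argument inside the data frame.
ratio <- function(df) with(df, Num / Den)

# Hypothetical helper: collect the messages instead of printing them,
# using the `report` argument of checkUsage().
collect_notes <- function(fun, ...) {
  notes <- character()
  checkUsage(fun, report = function(msg) notes <<- c(notes, msg), ...)
  notes
}

notes_default <- collect_notes(ratio)                   # with() body is analysed
notes_skipped <- collect_notes(ratio, skipWith = TRUE)  # with() body is skipped

# The function itself runs without error either way.
ratio(data.frame(Num = c(1, 2), Den = c(2, 4)))
```

The point of the sketch is that the note flags something the analysis cannot resolve, which may or may not be a real bug; telling the checker about the idiom (here via skipWith) is what removes the noise.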

 > And RCMD CHECK always will be too "stupid" to understand everything
 > that a rich language like R might quite reasonably cause experienced
 > coders to do.

Possibly the interpreter is too stupid to understand it too?

 > It should not be CRAN's business how I write my code, or even whether
 > my code does what it is supposed to. It might be CRAN's business to
 > try to work out whether my code breaks CRAN's policies, eg by causing 
 > R to crash horribly-- that's presumably what Warnings are for (but
 > see below). And maybe there could be circumstances where an automatic
 > check might be "worried" enough to alert the CRANia and require manual
 > explanation and emails etc from a developer, but even that seems
 > doomed given the growing deluge of packages.
 >
 > RCMD CHECK currently functions both as a "sanitizer" for CRAN, and as
 > a developer-tool. But the fact that the one program does both things
 > seems accidental to me, and I think this dual-use is muddying the
 > discussion. There's a big distinction between (i) code-checks that
 > developers themselves might or might not find useful-- which should
 > be left to the developer, and will vary from person to person--

I think this is a case of two heads being better than one. I did lots of
checks before the CRAN checks existed, but the CRAN checks still found
bugs in code that I considered very mature, including bugs in code that has
been running without noticeable problems for over 15 years. Despite all
the noise today, most of us are only talking about a small inconvenience 
around the intended meaning of "note", not about whether quality control 
is a bad thing. I've found the errors and warnings are always valid, 
even though I do not always like having to fix the bugs, and the notes 
are most often valid too. But there are a few false positives, so the 
checks that give notes are not yet reliable enough to give warnings or 
errors. But they should be sometime, so one should usually consider 
fixing the package code.

 >   and (ii) code-checks that CRAN enforces for its own peace-of-mind.

I think of this as being for the peace of mind of your package users.

 > Maybe it's convenient to have both functions in the same place, and
 > it'd be fine to use Notes for one and Warnings for the other, but the
 > different purposes should surely be kept clear.
 >
 > Personally, in building over 10 packages (only 2 on CRAN), I haven't
 > found RCMD CHECK to be of any use, except for the code-documentation
 > and example-running bits. I know other people have different
 > opinions, but that's the point: one-size-does-not-fit-all when it
 > comes to coding tools.
 >
 > And with respect to the Warnings themselves: I feel compelled to point
 > out that it's logically impossible to fully check whether R code will do
 > bad things. One has to wonder at what point adding new checks becomes
 > futile or counterproductive. There must be over 2000 people who have
 > written CRAN packages by now; every extra check and non-back-compatible
 > additional requirement runs the risk of generating false positives and
 > incurring many extra person-hours to "fix" non-problems.
 > Plus someone needs to document and explain the check (adding to the
 > rule mountain), plus there is the time spent in discussions like
 > this..!

Bugs in your packages will require users to waste a lot of time too, and 
possibly reach faulty results with much more serious consequences. Just 
because perfection may never be attained, this does not mean that 
progress should not be attempted, in small steps. Compared to Statlib, 
which basically followed your recommended approach, CRAN is a vast 
improvement.

Paul
 >
 > Mark
 >
 > Mark Bravington
 > CSIRO CMIS
 > Marine Lab
 > Hobart
 > Australia
 > ________________________________________
 > From: r-devel-bounces at r-project.org [r-devel-bounces at r-project.org] On Behalf Of Hadley Wickham [hadley at rice.edu]
 > Sent: 30 March 2012 07:42
 > To: William Dunlap
 > Cc:r-devel at stat.math.ethz.ch; Spencer Graves
 > Subject: Re: [Rd] CRAN policies
 >
 >> Most of that stuff is already in codetools, at least when it is checking
 >> functions with checkUsage().  E.g., arguments of ~ are not checked.  The
 >> expr argument to with() will not be checked if you add skipWith=TRUE to
 >> the call to checkUsage.
 >>
 >>   >  library(codetools)
 >>
 >>   >  checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred}))
 >>   <anonymous>: no visible binding for global variable 'Num' (:1)
 >>   <anonymous>: no visible binding for global variable 'Den' (:1)
 >>
 >>   >  checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred}), skipWith=TRUE)
 >>
 >>   >  checkUsage(function(dataFrame) with(DataFrame, {Num/Den ; Resp ~ Pred}), skipWith=TRUE)
 >>   <anonymous>: no visible binding for global variable 'DataFrame'
 >>
 >> The only part that I don't see is the mechanism to add code-walker
 >> functions to the environment in codetools that has the standard list of
 >> them for functions with nonstandard evaluation:
 >>   >  objects(codetools:::collectUsageHandlers, all=TRUE)
 >>    [1] "$"             "$<-"           ".Internal"
 >>    [4] "::"            ":::"           "@"
 >>    [7] "@<-"           "{"             "~"
 >>   [10] "<-"            "<<-"           "="
 >>   [13] "assign"        "binomial"      "bquote"
 >>   [16] "data"          "detach"        "expression"
 >>   [19] "for"           "function"      "Gamma"
 >>   [22] "gaussian"      "if"            "library"
 >>   [25] "local"         "poisson"       "quasi"
 >>   [28] "quasibinomial" "quasipoisson"  "quote"
 >>   [31] "Quote"         "require"       "substitute"
 >>   [34] "with"
 > It seems like we really need a standard way to add metadata to functions:
 >
 > attr(with, "special_args")<- "expr"
 > attr(lm, "special_args")<- c("formula", "weights", "subset")
 >
 > This would be useful because it could automatically contribute to the
 > documentation.
 >
 > Similarly,
 >
 > attr(my.new.method, "s3method")<- c("my.new", "method")
 >
 > could be useful.
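
A minimal sketch of the metadata idea quoted above, tried on a made-up function rather than on `with` or `lm` themselves. The attribute name "special_args" is taken from the proposal; `my_with` and the lookup helper `special_args()` are hypothetical, and this only shows that such an attribute can be attached and read back, not that any existing check consults it.

```r
# Hypothetical function whose `expr` argument is evaluated inside `data`.
my_with <- function(data, expr) {
  eval(substitute(expr), data, parent.frame())
}

# Attach the proposed metadata; the function remains callable.
attr(my_with, "special_args") <- "expr"

# Hypothetical lookup a checking tool could use: returns the names of
# non-standardly evaluated arguments, or character(0) if none are declared.
special_args <- function(f) {
  a <- attr(f, "special_args")
  if (is.null(a)) character(0) else a
}

special_args(my_with)
special_args(sum)
my_with(data.frame(a = 1:3), sum(a))
```

A checker built this way would have to be told to skip the listed arguments, so the attribute is documentation for the tool as much as for the reader.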
 >
 > Hadley
 >
 >
 > --
 > Assistant Professor / Dobelman Family Junior Chair
 > Department of Statistics / Rice University
 > http://had.co.nz/
 >
 > ______________________________________________
 > R-devel at r-project.org  mailing list
 > https://stat.ethz.ch/mailman/listinfo/r-devel


