[Rd] Bounty on Error Checking

Matthew Dowle mdowle at mdowle.plus.com
Fri Jan 4 15:51:56 CET 2013


On 04.01.2013 14:03, Duncan Murdoch wrote:
> On 13-01-04 8:32 AM, Matthew Dowle wrote:
>>
>> On Thu, Jan 3, 2013, Bert Gunter wrote:
>>> Well...
>>>
>>> On Thu, Jan 3, 2013 at 10:00 AM, ivo welch <ivo.welch <at>
>>> anderson.ucla.edu> wrote:
>>>>
>>>> Dear R developers---I just spent half a day debugging an R program,
>>>> which had two bugs---I selected the wrongly named variable, which
>>>> turns out to have been a scalar, which then happily multiplied as if
>>>> it was a matrix; and another wrongly named variable from a data frame,
>>>> that triggered no error when used as a[["name"]] or a$name .  there
>>>> should be an option to turn on that throws an error inside R when one
>>>> does this.  I cannot imagine that there is much code that wants to
>>>> reference non-existing columns in data frames.
>>>
>>> But I can -- and do it all the time: To add a new variable, "d" to a
>>> data frame, df,  containing only "a" and "b" (with 10 rows, say):
>>>
>>> df[["d"]] <- 1:10
>>
>> Yes but that's `[[<-`. Ivo was talking about `[[` and `$`; i.e., select
>> only not assign, if I understood correctly.
>>
>>>
>>> Trying to outguess documentation to create error triggers is a very
>>> bad idea.
>>
>> Why exactly is it a very bad idea? (I don't necessarily disagree, just
>> asking for more colour.)
>>
>>> R already has plenty of debugging tools -- and there is even a "debug"
>>> package. Perhaps you need a better programming editor/IDE. There are
>>> several listed on CRAN, RStudio, etc.
>>
>> True, but that relies on you knowing there's a bug to hunt for. What if
>> you don't know you're getting incorrect results, silently? In a similar
>> way that options(warn=2) turns known warnings into errors, to enable you
>> to be more strict if you wish,
>
> I would say the point of options(warn=2) is rather to let you find
> the location of the warning more easily, because it will abort the
> evaluation.

True, but as well as that, I sometimes like to run production systems
with options(warn=2). I'd prefer some tasks to halt at the slightest
hint of trouble than write a warning silently to a log file that may
not be looked at. I think of that as being more strict, more robust:
options(warn=2) is set even when there is no known warning, to catch
one if it arises in future, not just to find it more easily once you
know there is a warning.
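
To illustrate the "halt at the slightest hint of trouble" behaviour, here
is a minimal sketch. The function risky() is a hypothetical stand-in for
a production task, not code from this thread; only options(warn=2),
warning() and tryCatch() are base R.

```r
# Hypothetical task that warns (rather than fails) on suspicious input.
risky <- function(x) {
  if (x < 0) warning("negative input")
  sqrt(abs(x))
}

# Default behaviour: the warning is emitted and the result still comes back.
lenient <- suppressWarnings(risky(-4))

# Strict behaviour: with warn = 2 the same warning is converted to an
# error, so the task halts instead of continuing.
old <- options(warn = 2)
strict <- tryCatch(risky(-4), error = function(e) "halted")
options(old)  # restore the previous warning level
```

Under the default setting the call returns a value and the warning can go
unnoticed; under warn=2 the evaluation is aborted at the point of the
warning.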

> I would not recommend using code that issues warnings.

Not sure what you mean here.

>
>> an option to turn on warnings from `[[` and `$` if the column is
>> missing (select only, not assign) doesn't seem like a bad option to
>> have. Maybe it would reveal some previously silent bugs.
>
> I agree that this would sometimes be useful, but a very common
> convention is to do something like
>
> if (is.null(obj$element)) {  do something }
>
> These would all have to be re-written to something like
>
> if (missing.field(obj, "element")) { do something }
>
> There are several hundred examples of the first usage in base R; I
> imagine thousands more in contributed packages.

Yes but Ivo doesn't seem to be writing that if() in his code. We're
only talking about an option that users can turn on for their own
code, iiuc. Not anything that would affect or break thousands of
packages. That's why I referred to the fact that all packages now
have namespaces, in the earlier post.

> I don't think the benefit of the change is worth all the work that
> would be necessary to implement it.

It doesn't seem to be a lot of work. I already posted a working
straw man, for example, as a first step.
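
For readers without the earlier post to hand: the following is not that
straw man, only a rough sketch of the kind of opt-in check being
discussed. The name strict_get() is hypothetical, and it is written as
an ordinary function for user code rather than as a change to `[[` or
`$` themselves.

```r
# Hypothetical strict accessor: select only, and fail loudly if the
# requested column does not exist.
strict_get <- function(x, name) {
  if (!name %in% names(x)) {
    stop(sprintf("column '%s' not found", name), call. = FALSE)
  }
  x[[name]]
}

df <- data.frame(a = 1:3, b = 4:6)

silent <- df$d                 # base `$` quietly returns NULL
found  <- strict_get(df, "a")  # existing column is returned as usual

# A misspelt or missing column now raises an error instead of NULL.
msg <- tryCatch(strict_get(df, "d"),
                error = function(e) conditionMessage(e))
```

Because the check lives in the user's own code, existing idioms such as
is.null(obj$element) in packages are left untouched.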

Matthew


