[R] setting sensitivity of r to errors

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Thu Mar 25 14:11:17 CET 2010


On Thu, Mar 25, 2010 at 12:17 PM, Jannis <bt_jannis at yahoo.de> wrote:
> Dear all,
>
> does anyone of you know how to increase R's sensitivity to errors?

 There are a few things you can do with options(): "check.bounds"
warns you when you extend a vector 'accidentally' by assigning past
its end, and "warn" lets you do things like turn warnings into errors
(warn=2) so the code stops. See ?options for more info.
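
 A minimal sketch of those two options in use. The example inputs
(as.integer("abc") and the vector 'x') are made up for illustration,
and the old settings are restored at the end:

```r
# Make R stricter for a while; options() returns the previous values.
old <- options(warn = 2, check.bounds = TRUE)

x <- 1:3

# With warn = 2 the coercion warning becomes an error, so try() catches it.
r1 <- try(as.integer("abc"), silent = TRUE)
inherits(r1, "try-error")

# check.bounds warns when an assignment extends a vector;
# combined with warn = 2 that warning is promoted to an error too.
r2 <- try({ x[5] <- 6L }, silent = TRUE)
inherits(r2, "try-error")

# Put the previous settings back.
options(old)
```

 Note this only affects things that already produce warnings; plain
extraction such as y[200] stays silent.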

> I just migrated back from Matlab and really enjoyed that Matlab pops up with (really helpful!) error messages as soon as there is anything slightly wrong with my code. This is certainly annoying on the first run, but it really helps to uncover hidden bugs in the code. Now I tried to create errors artificially in R to understand the try() function. Surprisingly, I hardly managed to create one! It would help if at least things like this would create errors:
> y[NA]
> y[200]
> y[0.5]
> 1/x[20]
> y[x]
>
> The last 4 lines did not produce any error, just NAs or empty arrays.
>
>
> Is there any way to change this?

 Not that I can see. The vector extraction function ("[") is pretty
primitive and low-level and lots of code uses it. You could write a
vector extraction function that did what you wanted and then called [,
something like:

getCheck = function(v,e){
  # refuse any NA in the index vector, otherwise defer to "["
  if(any(is.na(e))){
    stop("Tried to get NA values")
  }
  v[e]
}

> getCheck(x,4)
[1] 104
> getCheck(x,c(1,2,3))
[1] 101 102 103
> getCheck(x,c(1,NA,3))
Error in getCheck(x, c(1, NA, 3)) : Tried to get NA values

 but you'd have to replace every [ operation you have with a call to
getCheck. And you'd have to make sure getCheck doesn't have any bugs
in it.
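
 The same idea can be stretched to cover the other cases from the
original question (y[200], y[0.5]). This is only a sketch: strictGet
is a hypothetical name, 'x' is assumed to be 101:110 as the outputs
above suggest, and it deliberately rejects negative and zero indices
even though "[" accepts them:

```r
# Hypothetical stricter variant of getCheck(); the checks are illustrative.
strictGet <- function(v, e) {
  if (!is.numeric(e)) stop("index must be numeric")
  if (any(is.na(e))) stop("index contains NA")
  if (any(e != round(e))) stop("index is not a whole number")
  if (any(e < 1 | e > length(v))) stop("index outside 1..", length(v))
  v[e]
}

x <- 101:110                                  # assumed example data
ok   <- strictGet(x, c(1, 2, 3))              # passes all checks
far  <- try(strictGet(x, 200), silent = TRUE) # out of range
frac <- try(strictGet(x, 0.5), silent = TRUE) # fractional index
```

 Here 'far' and 'frac' both come back as "try-error" objects, whereas
x[200] and x[0.5] would quietly return NA and an empty vector.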

 There's some horrible subtleties with R's vector indexing with NA
values. For example:

> x[c(3,4,NA,NA)]
[1] 103 104  NA  NA
> x[c(3,NA,NA,NA)]
[1] 103  NA  NA  NA
> x[c(NA,NA,NA,NA)]
 [1] NA NA NA NA NA NA NA NA NA NA

 - as soon as every element of the indexing vector is NA, you get back
a vector of NAs of the length of your input 'x' rather than the length
of the indexing vector. The documentation of "[" says, under "NAs in
indexing":

     When extracting, a numerical, logical or character ‘NA’ index
     picks an unknown element and so returns ‘NA’ in the corresponding
     element of a logical, integer, numeric, complex or character
     result, and ‘NULL’ for a list.  (It returns ‘00’ for a raw
     result.]

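 which made me think it's a bug, or at least an undocumented feature
in 2.10.1, though a more likely explanation is that c(NA,NA,NA,NA) is a
logical vector (a bare NA is logical), and logical indices are recycled
to the length of 'x'. (Also, I just noticed a bracket mismatch in that
paragraph!]}. Perhaps this is fixed in the latest.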
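
 The type of the NA seems to be what matters here. A small check,
assuming 'x' is 101:110 as the outputs above suggest (NA_integer_ is
R's integer-typed NA constant):

```r
x <- 101:110                     # assumed example data

typeof(c(NA, NA, NA, NA))        # "logical": a bare NA is logical
typeof(c(3, NA, NA, NA))         # "double": the NAs are coerced by c()

# A logical index is recycled to length(x), hence 10 results.
all_logical <- x[c(NA, NA, NA, NA)]
length(all_logical)

# An integer NA index picks one unknown element per entry, hence 4 results.
all_integer <- x[rep(NA_integer_, 4)]
length(all_integer)
```

 So a wrapper like getCheck might want to look at the type of the
index as well as its values.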

> My problem is that I am running large loops over a huge set of time series that are so different in size and number of NAs that it is hard to figure out all possible errors beforehand (if I could do so, I could most probably publish a paper about my series straight away :-) )

 It is probably a good idea to figure it all out beforehand, even if
it doesn't help your publication count!

 You might also note the na.rm argument of sum() and similar functions.
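
 For instance, with a made-up series 'y' containing one NA:

```r
y <- c(1.5, NA, 2.5)             # illustrative data, not from the thread

s_default <- sum(y)              # NA propagates by default
s_clean   <- sum(y, na.rm = TRUE)   # NA dropped before summing
m_clean   <- mean(y, na.rm = TRUE)  # same argument works for mean()
n_missing <- sum(is.na(y))       # count the NAs explicitly
```

 Counting the NAs alongside the na.rm result makes it harder to
overlook a series that is mostly missing.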

Barry


