[Rd] Shouldn't vector indexing with negative out-of-range index give an error?

Martin Maechler maechler at lynne.stat.math.ethz.ch
Tue May 5 16:01:17 CEST 2015


>>>>> Henrik Bengtsson <henrik.bengtsson at ucsf.edu>
>>>>>     on Mon, 4 May 2015 12:20:44 -0700 writes:

    > In Section 'Indexing by vectors' of 'R Language Definition'
    > (http://cran.r-project.org/doc/manuals/r-release/R-lang.html#Indexing-by-vectors)
    > it says:

    > "Integer. All elements of i must have the same sign. If they are
    > positive, the elements of x with those index numbers are selected. If
    > i contains negative elements, all elements except those indicated are
    > selected.

    > If i is positive and exceeds length(x) then the corresponding
    > selection is NA. A negative out of bounds value for i causes an error.

    > A special case is the zero index, which has null effects: x[0] is an
    > empty vector and otherwise including zeros among positive or negative
    > indices has the same effect as if they were omitted."

    > However, that "A negative out of bounds value for i causes an error"
    > in the second paragraph does not seem to apply.  Instead, R silently
    > ignores negative indices that are out of range.  For example:

    >> x <- 1:4
    >> x[-9L]
    > [1] 1 2 3 4
    >> x[-c(1:9)]
    > integer(0)
    >> x[-c(3:9)]
    > [1] 1 2

    >> y <- as.list(1:4)
    >> y[-c(1:9)]
    > list()

    > Is the observed non-error the correct behavior and therefore the
    > documentation is incorrect, or is it vice versa?  (...or is it me
    > missing something)

    > I get the above on R devel, R 3.2.0, and as far back as R 2.11.0
    > (haven't checked earlier versions).

Thank you, Henrik!

I've checked further back: The change happened between R 2.5.1 and R 2.6.0.

The previous behavior was

  > (1:3)[-(3:5)]
  Error: subscript out of bounds

If you start reading NEWS.2, you see a *lot* of new features
(and bug fixes) in the 2.6.0 news, but from my browsing, none of
them mentions the new behavior as a feature.

Let's -- for a moment -- declare it a bug in the code, i.e., not
in the documentation:

- As 2.6.0 was released quite a while ago (Oct. 2007),
  we could wonder how much R code would break if we fixed the bug.

- Is the R package authors' community willing to do the necessary
  cleanup in their packages ?

---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 


Now, after reading the source code for a while, and looking at
the changes, I've found the log entry

------------------------------------------------------------------------
r42123 | ihaka | 2007-07-05 02:00:05 +0200 (Thu, 05 Jul 2007) | 4 lines

Changed the behaviour of out-of-bounds negative
subscripts to match that of S.  Such values are
now ignored rather than tripping an error.

------------------------------------------------------------------------

So, it was changed on purpose, by one of the true "R"s, very
much on purpose.

Making it a *warning* instead of the original error
may have been both more cautious and more helpful for
detecting programming errors.
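
A minimal sketch of what such a warning could look like at the R level,
purely as an illustration: drop_checked() is a hypothetical helper, not an
existing R function, and it says nothing about how the C-level subsetting
code is organized.  It behaves like x[i] for non-positive i, but warns when
an index points past the end of x.

```r
# Hypothetical helper (not part of base R): drop elements by negative
# index, warning when any index lies beyond length(x).
drop_checked <- function(x, i) {
  # Only non-positive, non-missing numeric indices are handled here.
  stopifnot(is.numeric(i), !anyNA(i), all(i <= 0))
  oob <- -i > length(x)            # TRUE where the index is out of bounds
  if (any(oob)) {
    warning("negative subscript(s) out of bounds: ",
            paste(i[oob], collapse = ", "), call. = FALSE)
  }
  x[i]                             # the result itself is unchanged
}
```

With this, drop_checked(1:3, -(3:5)) would still return 1 2, as in current
R, but with a warning naming -4 and -5.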

OTOH, John Chambers, the father of S and hence grandfather of R,
may have had good reasons why it seemed more logical to silently
ignore such out-of-bounds negative indices:
One could argue that

   x[-5]  means  "leave out the 5-th element of x"

and if there is no 5-th element of x, leaving it out should be a no-op.
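
This reading can be checked directly.  As a small sketch: under the
"leave out" interpretation, negative indexing should agree with keeping
exactly those positions of seq_along(x) that are not listed, which is what
setdiff() computes.  The helper keep() is only a name chosen for this
illustration.

```r
x <- 1:4

# "Leave out position 5": there is no such position, so nothing changes.
identical(x[-5], x)

# Negative indexing agrees with keeping the positions not listed,
# whether or not all the listed positions exist.
keep <- function(x, i) x[setdiff(seq_along(x), i)]
identical(x[-(3:9)], keep(x, 3:9))
identical(x[-(1:9)], keep(x, 1:9))
```

All three comparisons should give TRUE under the behavior in place since
R 2.6.0; under the earlier behavior the out-of-range cases would have
stopped with "subscript out of bounds" instead.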

After all this musing and history detection, my gut decision
would be to only change the documentation which Ross forgot to change.

But of course, it may be interesting to hear other programmeRs' feedback on this.

Martin


