(PR#7824) [Rd] handling of zero and negative indices in

ripley at stats.ox.ac.uk
Fri May 6 14:43:57 CEST 2005


I've put this in (with some different wording).  Although S blithely
accepts mis-dimensioned index matrices, I agree this is wrong and have made
it an error.

On Fri, 29 Apr 2005 tplate at acm.org wrote:

> This message contains a description of what looks like a bug, examples
> of the suspect behavior, a proposed change to the C code to change this
> behavior, example of behavior with the fix, and suggestions for 3 places
> to update the documentation to reflect the proposed behavior.  It is
> submitted for consideration for inclusion in R.  Comments are requested.
>
> Currently, the code for subscripting by matrices checks that values in
> the matrix are not greater than the dimensions of the array being
> indexed.  However, it does not check for zero or negative indices, and
> blindly does index computation with them as though they were positive
> indices (including for negative indices whose absolute value exceeds the
> dimensions of the array being indexed).  Here are examples of indexing
> by matrices that do not return unequivocally sensible results (in most
> cases):
>
> > x <- matrix(1:6,ncol=2)
> > dim(x)
> [1] 3 2
> > x
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    5
> [3,]    3    6
> > x[rbind(c(1,1), c(2,2))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,1))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,0))]
> Error: only 0's may be mixed with negative subscripts
> > x[rbind(c(1,1), c(2,2), c(0,2))]
> [1] 1 5 3
> > x[rbind(c(1,1), c(2,2), c(0,3))]
> Error: subscript out of bounds
> > x[rbind(c(1,1), c(2,2), c(1,0))]
> Error: only 0's may be mixed with negative subscripts
> > x[rbind(c(1,1), c(2,2), c(2,0))]
> Error: only 0's may be mixed with negative subscripts
> > x[rbind(c(1,1), c(2,2), c(3,0))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(1,2))]
> [1] 1 5 4
> > x[rbind(c(1,1), c(2,2), c(-1,2))]
> [1] 1 5 2
> > x[rbind(c(1,1), c(2,2), c(-2,2))]
> [1] 1 5 1
> > x[rbind(c(1,1), c(2,2), c(-3,2))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(-4,2))]
> Error: only 0's may be mixed with negative subscripts
> > x[rbind(c(1,1), c(2,2), c(-1,-1))]
> Error: subscript out of bounds
> >
> > # range checks are not applied to negative indices
> > x <- matrix(1:6, ncol=3)
> > dim(x)
> [1] 2 3
> > x[rbind(c(1,1), c(2,2), c(-3,3))]
> [1] 1 4 1
> > x[rbind(c(1,1), c(2,2), c(-4,3))]
> [1] 1 4
> >
> > version
>          _
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    2
> minor    1.0
> year     2005
> month    04
> day      18
> language R
> >
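
One plausible reading of the output above, assuming the unpatched code
applied the usual column-major offset arithmetic (the same formula the
proposed function keeps) with no sign checks, is sketched below.  The
helper name naive_subscript is hypothetical and is not taken from the R
sources.

```c
#include <stddef.h>

/* Hypothetical helper, not from the R sources: applies the column-major
 * offset formula to one row of an index matrix with no sign checks and
 * returns the resulting 1-based vector subscript. */
int naive_subscript(const int *row, const int *dims, size_t ndims)
{
    int offset = 0;   /* 0-based offset accumulated so far */
    int stride = 1;   /* product of the extents already consumed */
    for (size_t j = 0; j < ndims; j++) {
        offset += (row[j] - 1) * stride;
        stride *= dims[j];
    }
    return offset + 1;
}
```

For the 3 x 2 matrix, the row c(0,2) maps to subscript 3 (hence the
extra value 3), c(-3,2) maps to 0 (silently dropped), and c(-4,2) maps
to -1, which then collides with the positive subscripts and would
explain the "only 0's may be mixed with negative subscripts" error.
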
>
> The following version of mat2indsub() (to replace the one in
> src/main/subscript.c) allows zero indices and explicitly disallows all
> negative indices:
>
> /* Special Matrix Subscripting: Handles the case x[i] where */
> /* x is an n-way array and i is a matrix with n columns. */
> /* This code returns a vector containing the integer subscripts */
> /* to be extracted when x is regarded as unravelled. */
> /* Negative indices are not allowed. */
> /* A zero anywhere in a row will cause a zero in the same */
> /* position in the result. */
>
> SEXP mat2indsub(SEXP dims, SEXP s)
> {
>     int tdim, j, i, k, nrs = nrows(s);
>     SEXP rvec;
>
>     PROTECT(rvec = allocVector(INTSXP, nrs));
>     s = coerceVector(s, INTSXP);
>     setIVector(INTEGER(rvec), nrs, 0);
>
>     for (i = 0; i < nrs; i++) {
>         tdim = 1;
>         /* compute 0-based subscripts for a row (0 in the input */
>         /* gets -1 in the output here) */
>         for (j = 0; j < LENGTH(dims); j++) {
>             k = INTEGER(s)[i + j * nrs];
>             if(k == NA_INTEGER) {
>                 INTEGER(rvec)[i] = NA_INTEGER;
>                 break;
>             }
>             if(k < 0)
>                 error(_("cannot have negative values in matrices used as subscripts"));
>             if(k == 0) {
>                 INTEGER(rvec)[i] = -1;
>                 break;
>             }
>             if (k > INTEGER(dims)[j])
>                 error(_("subscript out of bounds"));
>             INTEGER(rvec)[i] += (k - 1) * tdim;
>             tdim *= INTEGER(dims)[j];
>         }
>         /* transform to 1 based subscripting (0 in the input */
>         /* gets 0 in the output here) */
>         if(INTEGER(rvec)[i] != NA_INTEGER)
>             INTEGER(rvec)[i]++;
>     }
>     UNPROTECT(1);
>     return (rvec);
> }
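
For experimenting with the row logic outside of R, the following is a
minimal standalone sketch of the same rules, not the patch itself.  It
assumes a plain int array in column-major order, uses INT_MIN as a
stand-in for NA_INTEGER, and reports problems through return codes
rather than error(); the names mat_to_vec_index, NA_INT and the IDX_*
codes are all hypothetical.

```c
#include <limits.h>
#include <stddef.h>

#define NA_INT INT_MIN  /* stand-in for R's NA_INTEGER in this sketch */

enum { IDX_OK = 0, IDX_NEGATIVE = 1, IDX_OUT_OF_BOUNDS = 2 };

/* s is an nrs x ndims index matrix stored column-major; out receives one
 * 1-based vector subscript per row (0 for a row containing a zero,
 * NA_INT for a row containing an NA).  Entries are examined left to
 * right, so a zero or NA stops the scan of that row. */
int mat_to_vec_index(const int *dims, size_t ndims,
                     const int *s, size_t nrs, int *out)
{
    for (size_t i = 0; i < nrs; i++) {
        int offset = 0, stride = 1;
        int special = 0;  /* set when the row yields NA or 0 */
        out[i] = 0;
        for (size_t j = 0; j < ndims; j++) {
            int k = s[i + j * nrs];
            if (k == NA_INT) { out[i] = NA_INT; special = 1; break; }
            if (k < 0) return IDX_NEGATIVE;
            if (k == 0) { out[i] = 0; special = 1; break; }
            if (k > dims[j]) return IDX_OUT_OF_BOUNDS;
            offset += (k - 1) * stride;
            stride *= dims[j];
        }
        if (!special) out[i] = offset + 1;
    }
    return IDX_OK;
}
```

Because the scan of a row stops at the first zero, a row such as c(0,3)
yields 0 without triggering the bounds check on its second entry.
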
>
> With this version, the above commands (plus a couple more) produce the
> following output (with commands that fail suitably wrapped in try() so
> that they can be included in a test file):
>
> > x <- matrix(1:6,ncol=2)
> > dim(x)
> [1] 3 2
> > x
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    5
> [3,]    3    6
> > x[rbind(c(1,1), c(2,2))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,1))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,0))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,2))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,3))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(1,0))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(2,0))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(3,0))]
> [1] 1 5
> > x[rbind(c(1,0), c(0,2), c(3,0))]
> numeric(0)
> > x[rbind(c(1,0), c(0,0), c(3,0))]
> numeric(0)
> > x[rbind(c(1,1), c(2,2), c(1,2))]
> [1] 1 5 4
> > x[rbind(c(1,1), c(2,NA), c(1,2))]
> [1]  1 NA  4
> > x[rbind(c(1,0), c(2,NA), c(1,2))]
> [1] NA  4
> > try(x[rbind(c(1,1), c(2,2), c(-1,2))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-1, 2))]) :
>         cannot have negative values in matrices used as subscripts
> > try(x[rbind(c(1,1), c(2,2), c(-2,2))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-2, 2))]) :
>         cannot have negative values in matrices used as subscripts
> > try(x[rbind(c(1,1), c(2,2), c(-3,2))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-3, 2))]) :
>         cannot have negative values in matrices used as subscripts
> > try(x[rbind(c(1,1), c(2,2), c(-4,2))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-4, 2))]) :
>         cannot have negative values in matrices used as subscripts
> > try(x[rbind(c(1,1), c(2,2), c(-1,-1))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-1, -1))]) :
>         cannot have negative values in matrices used as subscripts
> >
> > # verify that range checks are applied to negative indices
> > x <- matrix(1:6, ncol=3)
> > dim(x)
> [1] 2 3
> > try(x[rbind(c(1,1), c(2,2), c(-3,3))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-3, 3))]) :
>         cannot have negative values in matrices used as subscripts
> > try(x[rbind(c(1,1), c(2,2), c(-4,3))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-4, 3))]) :
>         cannot have negative values in matrices used as subscripts
> >
>
>
> The documentation page for ?Extract could have the following sentences
> added to the end of the 2nd para in the description of arguments "i, j,
> ...":
>
>   Negative indices are not allowed in matrices used as indices.
>   NA and zero values are allowed: rows containing a zero are
>   omitted from the result, and rows containing an NA produce an
>   NA in the result.
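
The effect of that wording on the result can be sketched as follows,
assuming the index matrix has already been reduced to a vector of
1-based subscripts in which 0 marks a row containing a zero and NA_INT
(a stand-in for NA_INTEGER) marks a row containing an NA.  The function
name extract_by_index is hypothetical, and the caller is assumed to
guarantee that every other subscript lies within the data.

```c
#include <limits.h>
#include <stddef.h>

#define NA_INT INT_MIN  /* stand-in for R's NA_INTEGER in this sketch */

/* Copies x[idx[i] - 1] into out for each subscript, skipping zeros and
 * passing NA through.  Assumes every other subscript is a valid 1-based
 * position in x.  Returns the number of elements written, which is
 * smaller than nidx whenever idx contains zeros. */
size_t extract_by_index(const int *x, const int *idx, size_t nidx, int *out)
{
    size_t n = 0;
    for (size_t i = 0; i < nidx; i++) {
        if (idx[i] == 0)
            continue;                 /* zero: row omitted from result */
        if (idx[i] == NA_INT)
            out[n++] = NA_INT;        /* NA: produces an NA in result */
        else
            out[n++] = x[idx[i] - 1]; /* ordinary 1-based subscript */
    }
    return n;
}
```

With x holding 1 to 6, the subscripts (1, 5, 0) give a result of length
2, and (0, NA, 4) give (NA, 4), in line with the output shown earlier.
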
>
> In "Introduction to R", the same material could be added to the end of
> the section "Index Arrays" (in the chapter "Arrays and matrices").
>
> In "R Language Definition", the third paragraph of the section "Indexing
> matrices and arrays" (Section 3.4.2 in my copy) currently reads:
>
>> It is possible to use a matrix of integers as an index. In
>> this case, the number of columns of the matrix should match
>> the number of dimensions of the structure, and the result
>> will be a vector with length as the number of rows of the
>> matrix. The following example shows how to extract the
>> elements m[1, 1] and m[2, 2] in one operation.
>
> It could be changed to the following:
>
> It is possible to use a matrix of integers as an index.
> Negative indices are not allowed in matrices used as
> indices.  NA and zero values are allowed: rows containing a
> zero are omitted from the result, and rows containing an NA
> produce an NA in the result.  To use a matrix as an index,
> the number of columns of the matrix should match the number
> of dimensions of the structure, and the result will be a
> vector with length as the number of rows of the matrix
> (except when the matrix contains zeros). The following
> example shows how to extract the elements m[1, 1] and m[2,
> 2] in one operation.
>
>
> Additionally, indexing by matrix just goes ahead and uses matrix
> elements as vector indices in cases where the number of columns in the
> matrix does not match the number of dimensions in the array.  E.g.:
>
> > x <- matrix(1:6,ncol=2)
> > x[rbind(c(1,1,1), c(2,2,2))]
> [1] 1 2 1 2 1 2
> >
>
> Would it not be preferable behavior to stop with an error in such cases?
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


