[R] sweep() and recycling

Heather Turner Heather.Turner at warwick.ac.uk
Tue Jun 21 15:33:31 CEST 2005


I think the warning condition in Robin's patch is too harsh: the following examples seem reasonable to me, but all produce warnings.

sweep(array(1:24, dim = c(4,3,2)), 1, 1:2, give.warning = TRUE)
sweep(array(1:24, dim = c(4,3,2)), 1, 1:12, give.warning = TRUE)
sweep(array(1:24, dim = c(4,3,2)), 1, 1:24, give.warning = TRUE)
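
To see why these cases seem reasonable, it helps to look at the array that sweep() builds from STATS via array(STATS, dims[perm]). The following is only a sketch in base R; the name `expanded` is introduced here for illustration and is not part of any patch.

```r
# Dimensions of the example array; with MARGIN = 1, perm is simply 1:3.
dims <- c(4, 3, 2)

# array() recycles STATS down the first dimension first.
expanded <- array(1:2, dims)

# length(STATS) = 2 divides the extent 4, so every column of every
# slice holds the same pattern 1, 2, 1, 2.
expanded[, 1, 1]
expanded[, 3, 2]

# length(STATS) = 12 = 4 * 3, so each slice along the third dimension
# receives the same 4 x 3 matrix.
identical(array(1:12, dims)[, , 1], array(1:12, dims)[, , 2])
```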

I have written an alternative (given below) that gives no warning in the above cases, but does warn in the following case:

> sweep(array(1:24, dim = c(4,3,2)), 1:2, 1:3)
, , 1

     [,1] [,2] [,3]
[1,]    0    3    6
[2,]    0    3    9
[3,]    0    6    9
[4,]    3    6    9

, , 2

     [,1] [,2] [,3]
[1,]   12   15   18
[2,]   12   15   21
[3,]   12   18   21
[4,]   15   18   21

Warning message:
STATS does not recycle exactly across MARGIN

The code could easily be modified to warn in other cases, e.g. when the length of STATS is a proper divisor of the corresponding array extent (as in the first example above, with length(STATS) = 2).
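
One possible form of such a stricter test is sketched below. The helper `is_proper_divisor` is hypothetical and not part of the posted code. It assumes that "the corresponding array extent" means the product of the extents selected by MARGIN, and that a STATS of length 1 should never warn.

```r
# Hypothetical helper: TRUE when length(STATS) is greater than 1 and a
# proper divisor of the combined extent of the MARGIN dimensions.
is_proper_divisor <- function(dims, MARGIN, STATS) {
    s <- length(STATS)
    extent <- prod(dims[MARGIN])
    s > 1 && s < extent && extent %% s == 0
}

is_proper_divisor(c(4, 3, 2), 1, 1:2)   # first example above: TRUE
is_proper_divisor(c(4, 3, 2), 1, 1:4)   # STATS matches the extent: FALSE
```

A stricter sweep() could call warning() whenever this returns TRUE.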

The code also includes Gabor's suggestion.
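
That is, the warn argument in the function below defaults to getOption("warn"), and the recycling check is skipped when it is negative. A small sketch of how the option itself behaves in base R (the variable name `old` is only illustrative):

```r
# Current value of the "warn" option; negative values tell R to
# ignore warnings.
getOption("warn")

# Temporarily set the option, then restore the previous value.
old <- options(warn = -1)
getOption("warn")   # -1
options(old)
```

With this signature a caller can also pass warn = -1 directly to silence the check for a single call without touching the global option.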

Heather

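sweep <- function (x, MARGIN, STATS, FUN = "-", warn = getOption("warn"), ...)
{
    FUN <- match.fun(FUN)
    dims <- dim(x)
    # Put the MARGIN dimensions first, followed by the remaining ones.
    perm <- c(MARGIN, (1:length(dims))[-MARGIN])
    # Skip the check when warnings are suppressed (negative warn).
    if (warn >= 0) {
        s <- length(STATS)
        # Lengths of the leading sub-arrays: 1, d1, d1*d2, ...
        cumDim <- c(1, cumprod(dims[perm]))
        if (s > max(cumDim))
            warning("length of STATS greater than length of array",
                    call. = FALSE)
        else {
            # Nearest sub-array lengths above and below s: s must divide
            # the former and be a multiple of the latter.
            upper <- min(ifelse(cumDim > s, cumDim, max(cumDim)))
            lower <- max(ifelse(cumDim < s, cumDim, min(cumDim)))
            if (any(upper %% s != 0, s %% lower != 0))
                warning("STATS does not recycle exactly across MARGIN",
                        call. = FALSE)
        }
    }
    # Expand STATS to the permuted shape, then permute back to match x.
    FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
}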
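
The test inside the function can also be tried on its own. The sketch below isolates it in a hypothetical helper, `recycles_exactly`, which is not part of the patch. It takes the array dimensions, MARGIN and the length of STATS, and returns TRUE when neither warning would be raised.

```r
# Hypothetical helper mirroring the check in sweep() above.
recycles_exactly <- function(dims, MARGIN, s) {
    perm <- c(MARGIN, seq_along(dims)[-MARGIN])
    # Lengths of the leading sub-arrays after permutation: 1, d1, d1*d2, ...
    cumDim <- c(1, cumprod(dims[perm]))
    if (s > max(cumDim)) return(FALSE)
    # Nearest sub-array lengths above and below s, falling back to the
    # largest and smallest lengths when there is none.
    upper <- min(ifelse(cumDim > s, cumDim, max(cumDim)))
    lower <- max(ifelse(cumDim < s, cumDim, min(cumDim)))
    upper %% s == 0 && s %% lower == 0
}

dims <- c(4, 3, 2)
recycles_exactly(dims, 1, 2)     # TRUE
recycles_exactly(dims, 1, 12)    # TRUE
recycles_exactly(dims, 1, 24)    # TRUE
recycles_exactly(dims, 1:2, 3)   # FALSE: 3 does not divide 4
```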

>>> Gabor Grothendieck <ggrothendieck at gmail.com> 06/21/05 01:25pm >>>
Perhaps the signature should be:

   sweep(...other args go here..., warn=getOption("warn"))

so that the name and value of the argument are consistent with
the R warn option.

On 6/21/05, Robin Hankin <r.hankin at noc.soton.ac.uk> wrote:
> 
> On Jun 20, 2005, at 04:58 pm, Prof Brian Ripley wrote:
> 
> > The issue here is that the equivalent command array(1:5, c(6,6)) (to
> > matrix(1:5,6,6)) gives no warning, and sweep uses array().
> >
> > I am not sure either should: fractional recycling was normally allowed
> > in S3 (S4 tightened up a bit).
> >
> > Perhaps someone who thinks sweep() should warn could contribute a
> > tested patch?
> >
> 
> 
> OK,  modified R code and Rd file below (is this the best way to do
> this?)
> 
> 
> 
> 
> "sweep" <-
>   function (x, MARGIN, STATS, FUN = "-", give.warning = FALSE, ...)
> {
>   FUN <- match.fun(FUN)
>   dims <- dim(x)
>   if (give.warning && length(STATS) > 1 &&
>       any(dims[MARGIN] != dim(as.array(STATS)))) {
>     warning("array extents do not recycle exactly")
>   }
>   perm <- c(MARGIN, (1:length(dims))[-MARGIN])
>   FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
> }
> 
> 
> 
> 
> 
> 
> 
> \name{sweep}
> \alias{sweep}
> \title{Sweep out Array Summaries}
> \description{
>   Return an array obtained from an input array by sweeping out a summary
>   statistic.
> }
> \usage{
> sweep(x, MARGIN, STATS, FUN="-", give.warning = FALSE, \dots)
> }
> \arguments{
>   \item{x}{an array.}
>   \item{MARGIN}{a vector of indices giving the extents of \code{x}
>     which correspond to \code{STATS}.}
>   \item{STATS}{the summary statistic which is to be swept out.}
>   \item{FUN}{the function to be used to carry out the sweep.  In the
>     case of binary operators such as \code{"/"} etc., the function name
>     must be quoted.}
>   \item{give.warning}{Boolean, with default \code{FALSE} meaning to
>   give no warning, even if array extents do not match.  If
>   \code{TRUE}, check for the correct dimensions and if a
>   mismatch is detected, give a suitable warning.}
>   \item{\dots}{optional arguments to \code{FUN}.}
> }
> \value{
>   An array with the same shape as \code{x}, but with the summary
>   statistics swept out.
> }
> \note{
>   If \code{STATS} is of length 1, recycling is carried out with no
>   warning irrespective of the value of \code{give.warning}.
> }
> 
> \references{
>   Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988)
>   \emph{The New S Language}.
>   Wadsworth \& Brooks/Cole.
> }
> \seealso{
>   \code{\link{apply}} on which \code{sweep} used to be based;
>   \code{\link{scale}} for centering and scaling.
> }
> \examples{
> require(stats) # for median
> med.att <- apply(attitude, 2, median)
> sweep(data.matrix(attitude), 2, med.att)# subtract the column medians
> 
> a <- array(0, c(2, 3, 4))
> b <- matrix(1:8, 2, 4)
> sweep(a, c(1, 3), b, "+", give.warning = TRUE)  # no warning:
> all(dim(a)[c(1,3)] == dim(b))
> sweep(a, c(1, 2), b, "+", give.warning = TRUE)  # warning given
> 
> }
> \keyword{array}
> \keyword{iteration}
> 
> 
> 
> 
> --
> Robin Hankin
> Uncertainty Analyst
> National Oceanography Centre, Southampton
> European Way, Southampton SO14 3ZH, UK
>  tel  023-8059-7743
> 



