[Rd] [R] unvectorized option for outer()

Liaw, Andy andy_liaw at merck.com
Mon Oct 31 14:12:35 CET 2005


> From: Thomas Lumley
> 
> On Sun, 30 Oct 2005, Jonathan Rougier wrote:
> 
> > I'm not sure about this.  Perhaps I am a dinosaur, but my feeling is
> > that if people are writing functions in R that might be subject to
> > simple operations like outer products, then they ought to be writing
> > vectorised functions!
> 
> I would agree.  How about an oapply() function that does 
> multiway (rather 
> than just two-way) outer products.  Basing the name on "apply" would 
> emphasize the similarity to other flexible, not particularly 
> optimized 
> second-order functions.
> 
>  	-thomas

I'll toss in my $0.02:  The following is my attempt at creating a "general
outer" that works with more than two arguments.

gouter <- function (x, FUN, ...) {
    xgrid <- as.list(do.call("expand.grid", x))
    names(xgrid) <- NULL
    array(do.call(deparse(substitute(FUN)), c(xgrid, list(...))), 
        dim = sapply(x, length), dimnames = x)
}


Here's an example:

> example(gouter)

gouter> gouter(list(letters[1:2], 1:3, 2:4), paste, sep = "")
, , 2

  1     2     3    
a "a12" "a22" "a32"
b "b12" "b22" "b32"

, , 3

  1     2     3    
a "a13" "a23" "a33"
b "b13" "b23" "b33"

, , 4

  1     2     3    
a "a14" "a24" "a34"
b "b14" "b24" "b34"

Andy

> 
> 
> >		Maybe it's not possible to hold this line, and
> > maybe "outer" is not the right place to draw it, but I 
> think we ought to
> > respect the "x is a vector" mindset as much as possible in the base
> > package.  As Tony notes, the documentation does try to be 
> clear about
> > what outer actually does, and how it can be used.
> >
> > So I am not a fan of the VECTORIZED argument, and 
> definitely not a fan
> > of the VECTORIZED=FALSE default.
> >
> > Jonathan.
> >
> > Gabor Grothendieck wrote:
> >> If the default were changed to VECTORIZED=FALSE then it would
> >> still be functionally compatible with what we have now so 
> all existing
> >> software would continue to run correctly yet would not cause
> >> problems for the unwary.  Existing software would not have 
> to be changed
> >> to add VECTORIZED=TRUE except for those, presumbly few, cases
> >> where outer performance is critical.   One optimization might be to
> >> have the default be TRUE if the function is * or perhaps 
> if its specified
> >> as a single character and FALSE otherwise.
> >>
> >> Having used APL I always regarded the original design of 
> outer in R as
> >> permature performance optimization and this would be a 
> chance to get
> >> it right.
> >>
> >> On 10/28/05, Tony Plate <tplate at acm.org> wrote:
> >>
> >>> [following on from a thread on R-help, but my post here seems more
> >>> appropriate to R-devel]
> >>>
> >>> Would a patch to make outer() work with non-vectorized 
> functions be
> >>> considered?  It seems to come up moderately often on the 
> list, which
> >>> probably indicates that many many people get bitten by the same
> >>> incorrect expectation, despite the documentation and the 
> FAQ entry.  It
> >>> looks pretty simple to modify outer() appropriately: one 
> extra function
> >>> argument and an if-then-else clause to call mapply(FUN, 
> ...) instead of
> >>> calling FUN directly.
> >>>
> >>> Here's a function demonstrating this:
> >>>
> >>> outer2 <- function (X, Y, FUN = "*", ..., VECTORIZED=TRUE)
> >>> {
> >>>    no.nx <- is.null(nx <- dimnames(X <- as.array(X)))
> >>>    dX <- dim(X)
> >>>    no.ny <- is.null(ny <- dimnames(Y <- as.array(Y)))
> >>>    dY <- dim(Y)
> >>>    if (is.character(FUN) && FUN == "*") {
> >>>        robj <- as.vector(X) %*% t(as.vector(Y))
> >>>        dim(robj) <- c(dX, dY)
> >>>    }
> >>>    else {
> >>>        FUN <- match.fun(FUN)
> >>>        Y <- rep(Y, rep.int(length(X), length(Y)))
> >>>        if (length(X) > 0)
> >>>            X <- rep(X, times = ceiling(length(Y)/length(X)))
> >>>        if (VECTORIZED)
> >>>            robj <- FUN(X, Y, ...)
> >>>        else
> >>>            robj <- mapply(FUN, X, Y, MoreArgs=list(...))
> >>>        dim(robj) <- c(dX, dY)
> >>>    }
> >>>    if (no.nx)
> >>>        nx <- vector("list", length(dX))
> >>>    else if (no.ny)
> >>>        ny <- vector("list", length(dY))
> >>>    if (!(no.nx && no.ny))
> >>>        dimnames(robj) <- c(nx, ny)
> >>>    robj
> >>> }
> >>> # Some examples
> >>> f <- function(x, y, p=1) {cat("in f\n"); (x*y)^p}
> >>> outer2(1:2, 3:5, f, 2)
> >>> outer2(numeric(0), 3:5, f, 2)
> >>> outer2(1:2, numeric(0), f, 2)
> >>> outer2(1:2, 3:5, f, 2, VECTORIZED=F)
> >>> outer2(numeric(0), 3:5, f, 2, VECTORIZED=F)
> >>> outer2(1:2, numeric(0), f, 2, VECTORIZED=F)
> >>>
> >>> # Output on examples
> >>>
> >>>> f <- function(x, y, p=1) {cat("in f\n"); (x*y)^p}
> >>>> outer2(1:2, 3:5, f, 2)
> >>>
> >>> in f
> >>>     [,1] [,2] [,3]
> >>> [1,]    9   16   25
> >>> [2,]   36   64  100
> >>>
> >>>> outer2(numeric(0), 3:5, f, 2)
> >>>
> >>> in f
> >>>     [,1] [,2] [,3]
> >>>
> >>>> outer2(1:2, numeric(0), f, 2)
> >>>
> >>> in f
> >>>
> >>> [1,]
> >>> [2,]
> >>>
> >>>> outer2(1:2, 3:5, f, 2, VECTORIZED=F)
> >>>
> >>> in f
> >>> in f
> >>> in f
> >>> in f
> >>> in f
> >>> in f
> >>>     [,1] [,2] [,3]
> >>> [1,]    9   16   25
> >>> [2,]   36   64  100
> >>>
> >>>> outer2(numeric(0), 3:5, f, 2, VECTORIZED=F)
> >>>
> >>>     [,1] [,2] [,3]
> >>>
> >>>> outer2(1:2, numeric(0), f, 2, VECTORIZED=F)
> >>>
> >>> [1,]
> >>> [2,]
> >>>
> >>> If a patch to add this feature would be considered, I'd 
> be happy to
> >>> submit one (including documentation).  If so, and if there are any
> >>> potential traps I should bear in mind, please let me know!
> >>>
> >>> -- Tony Plate
> >>>
> >>> Rau, Roland wrote:
> >>>
> >>>> Dear all,
> >>>>
> >>>> a big thanks to Thomas Lumley, James Holtman and Tony 
> Plate for their
> >>>> answers. They all pointed in the same direction => I 
> need a vectorized
> >>>> function to be applied. Hence, I will try to work with a 
> 'wrapper'
> >>>> function as described in the FAQ.
> >>>>
> >>>> Thanks again,
> >>>> Roland
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Thomas Lumley [mailto:tlumley at u.washington.edu]
> >>>>> Sent: Thursday, October 27, 2005 11:39 PM
> >>>>> To: Rau, Roland
> >>>>> Cc: r-help at stat.math.ethz.ch
> >>>>> Subject: Re: [R] outer-question
> >>>>>
> >>>>>
> >>>>> You want FAQ 7.17 Why does outer() behave strangely 
> with my function?
> >>>>>
> >>>>>     -thomas
> >>>>>
> >>>>> On Thu, 27 Oct 2005, Rau, Roland wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>> Dear all,
> >>>>>>
> >>>>>> This is a rather lengthy message, but I don't know what I
> >>>>>
> >>>>> made wrong in
> >>>>>
> >>>>>
> >>>>>> my real example since the simple code works.
> >>>>>> I have two variables a, b and a function f for which I 
> would like to
> >>>>>> calculate all possible combinations of the values of a and b.
> >>>>>> If f is multiplication, I would simply do:
> >>>>>>
> >>>>>> a <- 1:5
> >>>>>> b <- 1:5
> >>>>>> outer(a,b)
> >>>>>>
> >>>>>> ## A bit more complicated is this:
> >>>>>> f <- function(a,b,d) {
> >>>>>>    return(a*b+(sum(d)))
> >>>>>> }
> >>>>>> additional <- runif(100)
> >>>>>> outer(X=a, Y=b, FUN=f, d=additional)
> >>>>>>
> >>>>>> ## So far so good. But now my real example. I would 
> like to plot the
> >>>>>> ## log-likelihood surface for two parameters alpha and beta of
> >>>>>> ## a Gompertz distribution with given data
> >>>>>>
> >>>>>> ### I have a function to generate random-numbers from a
> >>>>>> Gompertz-Distribution
> >>>>>> ### (using the 'inversion method')
> >>>>>>
> >>>>>> random.gomp <- function(n, alpha, beta) {
> >>>>>>          return( (log(1-(beta/alpha*log(1-runif(n)))))/beta)
> >>>>>> }
> >>>>>>
> >>>>>> ## Now I generate some 'lifetimes'
> >>>>>> no.people <- 1000
> >>>>>> al <- 0.1
> >>>>>> bet <- 0.1
> >>>>>> lifetimes <- random.gomp(n=no.people, alpha=al, beta=bet)
> >>>>>>
> >>>>>> ### Since I neither have censoring nor truncation in this
> >>>>>
> >>>>> simple case,
> >>>>>
> >>>>>
> >>>>>> ### the log-likelihood should be simply the sum of the 
> log of the
> >>>>>> ### the densities (following the parametrization of
> >>>>>
> >>>>> Klein/Moeschberger
> >>>>>
> >>>>>
> >>>>>> ### Survival Analysis, p. 38)
> >>>>>>
> >>>>>> loggomp <- function(alphas, betas, timep) {
> >>>>>> return(sum(log(alphas) + betas*timep + (alphas/betas *
> >>>>>> (1-exp(betas*timep)))))
> >>>>>> }
> >>>>>>
> >>>>>> ### Now I thought I could obtain a matrix of the
> >>>>>
> >>>>> log-likelihood surface
> >>>>>
> >>>>>
> >>>>>> ### by specifying possible values for alpha and beta 
> with the given
> >>>>>> data.
> >>>>>> ### I was able to produce this matrix with two for-loops.
> >>>>>
> >>>>> But I thought
> >>>>>
> >>>>>
> >>>>>> ### I could use also 'outer' in this case.
> >>>>>> ### This is what I tried:
> >>>>>>
> >>>>>> possible.alphas <- seq(from=0.05, to=0.15, length=30)
> >>>>>> possible.betas <- seq(from=0.05, to=0.15, length=30)
> >>>>>>
> >>>>>> outer(X=possible.alphas, Y=possible.betas, FUN=loggomp,
> >>>>>
> >>>>> timep=lifetimes)
> >>>>>
> >>>>>
> >>>>>> ### But the result is:
> >>>>>>
> >>>>>>
> >>>>>>> outer(X=possible.alphas, Y=possible.betas, FUN=loggomp,
> >>>>>>
> >>>>>> timep=lifetimes)
> >>>>>> Error in outer(X = possible.alphas, Y = possible.betas, FUN
> >>>>>
> >>>>> = loggomp,
> >>>>>
> >>>>>
> >>>>>> :
> >>>>>>      dim<- : dims [product 900] do not match the length
> >>>>>
> >>>>> of object [1]
> >>>>>
> >>>>>
> >>>>>> In addition: Warning messages:
> >>>>>> ...
> >>>>>>
> >>>>>> ### Can somebody give me some hint where the problem is?
> >>>>>> ### I checked my definition of 'loggomp' but I thought this
> >>>>>
> >>>>> looks fine:
> >>>>>
> >>>>>
> >>>>>> loggomp(alphas=possible.alphas[1], betas=possible.betas[1],
> >>>>>> timep=lifetimes)
> >>>>>> loggomp(alphas=possible.alphas[4], betas=possible.betas[10],
> >>>>>> timep=lifetimes)
> >>>>>> loggomp(alphas=possible.alphas[3], betas=possible.betas[11],
> >>>>>> timep=lifetimes)
> >>>>>>
> >>>>>>
> >>>>>> ### I'd appreciate any kind of advice.
> >>>>>> ### Thanks a lot in advance.
> >>>>>> ### Roland
> >>>>>>
> >>>>>>
> >>>>>> +++++
> >>>>>> This mail has been sent through the MPI for Demographic
> >>>>>
> >>>>> Rese...{{dropped}}
> >>>>>
> >>>>>
> >>>>>> ______________________________________________
> >>>>>> R-help at stat.math.ethz.ch mailing list
> >>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>>> PLEASE do read the posting guide!
> >>>>>
> >>>>> http://www.R-project.org/posting-guide.html
> >>>>>
> >>>>> Thomas Lumley                 Assoc. Professor, Biostatistics
> >>>>> tlumley at u.washington.edu      University of Washington, Seattle
> >>>>>
> >>>>
> >>>>
> >>>> +++++
> >>>> This mail has been sent through the MPI for Demographic 
> Research.  Should you receive a mail that is apparently from 
> a MPI user without this text displayed, then the address has 
> most likely been faked. If you are uncertain about the 
> validity of this message, please check the mail header or ask 
> your system administrator for assistance.
> >>>>
> >>>>
> >>>
> >>> ______________________________________________
> >>> R-help at stat.math.ethz.ch mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> >>>
> >>
> >>
> >> 
> ______________________________________________
> >> R-devel at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> > -- 
> > Jonathan Rougier                       Science Laboratories
> > Department of Mathematical Sciences    South Road
> > University of Durham                   Durham DH1 3LE
> > tel: +44 (0)191 334 3111, fax: +44 (0)191 334 3051
> > http://www.maths.dur.ac.uk/stats/people/jcr/jcr.html
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> 
> Thomas Lumley			Assoc. Professor, Biostatistics
> tlumley at u.washington.edu	University of Washington, Seattle
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
>



More information about the R-devel mailing list