[Rd] sapply improvements

Duncan Murdoch murdoch at stats.uwo.ca
Wed Nov 4 21:24:12 CET 2009


On 11/4/2009 12:15 PM, William Dunlap wrote:
>> -----Original Message-----
>> From: r-devel-bounces at r-project.org 
>> [mailto:r-devel-bounces at r-project.org] On Behalf Of Duncan Murdoch
>> Sent: Wednesday, November 04, 2009 8:47 AM
>> To: michael.m.spiegel at gmail.com
>> Cc: R-bugs at r-project.org; r-devel at stat.math.ethz.ch
>> Subject: Re: [Rd] error in install.packages() (PR#14042)
>> 
>> On 11/4/2009 11:05 AM, michael.m.spiegel at gmail.com wrote:
>> > Full_Name: Michael Spiegel
>> > Version: 2.10
>> > OS: Windows Vista
>> > Submission from: (NULL) (76.104.24.156)
>> > 
>> > 
>> > The following error is produced when attempting to call 
>> install.packages.  Here
>> > is the results of the traceback:
>> > 
>> >> source('http://openmx.psyc.virginia.edu/getOpenMx.R')
>> > Error in f(res) : invalid subscript type 'list'
>> >> traceback()
>> > 7: f(res)
>> > 6: available.packages(contriburl = contriburl, method = method)
>> > 5: .install.winbinary(pkgs = pkgs, lib = lib, contriburl = 
>> contriburl, 
>> >        method = method, available = available, destdir = destdir, 
>> >        dependencies = dependencies, ...)
>> > 4: install.packages(pkgs = c("OpenMx"), repos = repos)
>> > 3: eval.with.vis(expr, envir, enclos)
>> > 2: eval.with.vis(ei, envir)
>> > 1: source("http://openmx.psyc.virginia.edu/getOpenMx.R")
>> > 
>> > I've tracked the error down to somewhere in 
>> available.packages defined in
>> > src\library\utils\R\packages.R.  I am guessing that the 
>> error in version 2.10
>> > has something to do with the change: "available.packages() 
>> gains a 'filters'
>> > argument for specifying the filtering operations performed 
>> on the packages found
>> > in the repositories."
>> 
>> I've found the error, and will fix and commit to R-devel and 
>> R-patched.
>> 
>> For future reference:  the problem was that it assigned the result of 
>> sapply() to a subset of a vector.  Normally sapply() simplifies its 
>> result to a vector, but in this case the result was empty, so 
>> sapply() 
>> returned an empty list; assigning a list to a vector coerced 
>> the vector 
>> to a list, and then the "invalid subscript type 'list'" came 
>> soon after.
> 
> I've run into this sort of problem a lot (0-long input to sapply
> causes it to return list()).  A related problem is that when sapply's
> FUN doesn't always return the type of value you expect for some
> corner case then sapply won't do the expected simplication.  If
> sapply had an argument that gave the expected form of FUN's output
> then sapply could (a) die if some call to FUN didn't return something
> of that form and (b) return a 0-long object of the correct form
> if sapply's X has length zero so FUN is never called.  E.g.,
>    sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1)) # die on
> third iteration
>    sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1)) # return
> logical(0)
> 
> Another benefit of sapply knowing the type of FUN's return value is
> that it wouldn't have to waste space creating a list of FUN's return
> values but could stuff them directly into the final output structure.
> A list of n scalar doubles is 4.5 times bigger than double(n) and the
> factor is 9.0 for integers and logicals.


What do you think of the behaviour of the sapply function below?  (I 
wouldn't put it into R as it is, I'd translate it to C code to avoid the 
lapply call; but I'd like to get the behaviour right before doing that.)

This one checks that the length() and typeof() results are consistent. 
If the FUN.VALUE has names, those are used (but it doesn't require the 
names from FUN to match).

Duncan Murdoch

sapply <- function(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE, 
FUN.VALUE)
{
     FUN <- match.fun(FUN)
     answer <- lapply(X, FUN, ...)
     if(USE.NAMES && is.character(X) && is.null(names(answer)))
                 names(answer) <- X
     if(simplify) {
     	if (missing(FUN.VALUE)) {
     	    if ((!length(answer))
     	        || length(common.len <- unique(unlist(lapply(answer, 
length)))) != 1L)
     	        return(answer)
     	    common.names <- names(answer[[1L]])
     	} else {
     	    common.len <- length(FUN.VALUE)
     	    common.type <- typeof(FUN.VALUE)
     	    common.names <- names(FUN.VALUE)
     	    if (length(answer)) {
     	    	if (any( unlist(lapply(answer, length)) != common.len ))
     	    	    stop(sprintf("%s values must be of length %d", "FUN", 
common.len))
     	    	if (any( unlist(lapply(answer, typeof)) != common.type ))
     	    	    stop(sprintf("%s values must be of type '%s'", "FUN", 
common.type))
     	    	if (is.null(common.names))
     	    	    common.names <- names(answer[[1L]])
     	    } else if (length(FUN.VALUE) > 1)
     	    	return(array(FUN.VALUE[0], dim=c(common.len, 0),
     	    	             dimnames= if(!is.null(common.names))
     	    	                       list(common.names,character(0))))
     	    else
     	    	return(FUN.VALUE[0])	
     	}
	if(common.len == 1L)
	    unlist(answer, recursive = FALSE)
	else if(common.len > 1L)
	    array(unlist(answer, recursive = FALSE),
		  dim= c(common.len, length(X)),
		  dimnames= if(!(is.null(common.names) &
			         is.null(n2 <- names(answer)))) list(common.names,n2))
	else answer
     } else answer
}



More information about the R-devel mailing list