[Rd] bug in apply with median -- fixed in R-devel (NULL == ..)

Martin Maechler <maechler@stat.math.ethz.ch>
Wed, 27 Sep 2000 10:37:42 +0200

>>>>> "LarAm" == Larry Ammann <ammann@metronet.com> writes:

    LarAm> I have found a problem in R version 1.1.1 when using apply with the
    LarAm> median function.
    LarAm> The problem can be illustrated with the following data matrix:

    LarAm> X1  X2  X3
    LarAm>  1   2   3
    LarAm>  4   5   6
    LarAm>  7   8  NA

    LarAm> Enter this data matrix as X and then try
    LarAm>	 apply(X,2,median,na.rm=T)

Reproducible by

 X <- matrix(c(1:8,NA),3,3, dimnames = list(1:3,paste("V",1:3,sep="")))
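
For illustration, here is a minimal sketch of the full call on this X, assuming a version of R that already contains the fix (the name `res` is only used for this example). According to the NEWS entry quoted further down, R 1.1.1 stops with an error here instead of returning the column medians.

```r
# Same matrix as above; matrix() fills column-wise, so V3 is c(7, 8, NA)
X <- matrix(c(1:8, NA), 3, 3, dimnames = list(1:3, paste("V", 1:3, sep = "")))

# Column medians with NAs removed: V1 and V2 keep three values, V3 only two
res <- apply(X, 2, median, na.rm = TRUE)
print(res)
```

With the fix in place, the three medians should be 2, 5 and 7.5.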

Thank you!

    LarAm> The problem here is that the median function returns a named
    LarAm> scalar if the number of observations is odd, but returns an
    LarAm> unnamed scalar if the number of observations is even. This
    LarAm> confuses the apply function in this case at:

    LarAm>    ans.names <- names(ans[[1]])
    LarAm>    if (!ans.list)
    LarAm>       ans.list <- any(unlist(lapply(ans, length)) != l.ans)
    LarAm>    if (!ans.list && length(ans.names)) {
    LarAm>       all.same <- sapply(ans, function(x) all(names(x) == ans.names))
    LarAm> #here is the offending line
    LarAm>       if (!all(all.same))
    LarAm>          ans.names <- NULL
    LarAm>    }

Yes, your analysis is correct.

The problem is now fixed in R-devel (aka "1.2 unstable"); the reason is
given by the following entry in the BUG FIXES part of the ./NEWS file:

    o	NULL == ... now gives logical(0) instead of an error.
	This fixes a bug with e.g. apply(X,2,median, na.rm = TRUE) and
	all(NULL == NULL) is now TRUE.
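
A small sketch of what that entry means in practice, assuming an R version with the fix. The list `ans` is a made-up stand-in for the per-column results inside apply(), not its actual internal state, and `cmp` is just an illustrative name.

```r
# Comparing NULL with anything gives a zero-length logical vector
cmp <- NULL == "2"
print(cmp)        # logical(0)
print(all(cmp))   # TRUE, since all() of an empty vector is TRUE

# Hypothetical per-column results: two named scalars and one unnamed one
ans <- list(c("2" = 2), c("2" = 5), 7.5)
ans.names <- names(ans[[1]])

# The check quoted from apply(): names(7.5) is NULL, so the third
# comparison is NULL == "2", which no longer raises an error
all.same <- sapply(ans, function(x) all(names(x) == ans.names))
print(all.same)
```

Because the comparison against NULL yields logical(0) rather than an error, the check runs to completion for every element.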

    LarAm> This problem does not occur with S-Plus. My quick solution was to use
    LarAm> the quantile function instead of the median function:

    LarAm> apply(X,2,quantile,probs=.5,na.rm=T)

    LarAm> One way of fixing the problem then is to redefine median as

    LarAm> median <- function(x,na.rm=F,names=T)
    LarAm> quantile(x,probs=.5,na.rm=na.rm,names=names)

    LarAm> I don't know if this is a long-term solution though, since there may be
    LarAm> other functions with inconsistent naming policies that can confuse apply
    LarAm> as it is currently written.

And the above fix should take care of all of these as well!
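
As a side note on the quantile-based workaround quoted above, here is a short sketch of why it sidesteps the issue (the vectors `odd` and `even` are invented for this example): quantile() labels its result the same way whether the number of non-missing values is odd or even.

```r
odd  <- c(1, 2, 3)      # three observations
even <- c(7, 8, NA)     # two observations once the NA is removed

q.odd  <- quantile(odd,  probs = 0.5, na.rm = TRUE)
q.even <- quantile(even, probs = 0.5, na.rm = TRUE)

print(names(q.odd))     # "50%"
print(names(q.even))    # "50%"
print(unname(q.even))   # 7.5
```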

    LarAm> Larry Ammann
    LarAm> Professor of Mathematical Sciences
    LarAm> University of Texas at Dallas

Thanks again!

Martin Maechler <maechler@stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO D10	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><