# [Rd] RFC: sapply() limitation from vector to matrix, but not further

Martin Maechler maechler at stat.math.ethz.ch
Wed Dec 1 09:39:01 CET 2010

```
sapply() stems from S / S+ times and hence has a long tradition.
In spite of that I think that it should be enhanced...

As the subject mentions, sapply() produces a matrix in cases
where the list components of the lapply(.) results are of the
same length (and ...).
However, it unfortunately "stops there".
E.g., if you *nest* two sapply() calls where the inner one
produces a matrix, very often the logical behavior would be for
the outer sapply() to stack these matrices into an array of
rank 3 ["array rank"(x) := length(dim(x))].
However, it does not do that. E.g., an artificial example:

p0 <- function(...) paste(..., sep="")
myF <- function(x,y) {
    stopifnot(length(x) <= 3)
    x <- rep(x, length.out=3)
    ny <- length(y)
    r <- outer(x,y)
    dimnames(r) <- list(p0("r",1:3), p0("C", seq_len(ny)))
    r
}

and

> (v <- structure(10*(5:8), names=LETTERS[1:4]))
 A  B  C  D
50 60 70 80

If we tell sapply() not to simplify, we see the list of same-size
matrices it produces:

> sapply(v, myF, y = 2*(1:5), simplify=FALSE)
$A
    C1  C2  C3  C4  C5
r1 100 200 300 400 500
r2 100 200 300 400 500
r3 100 200 300 400 500

$B
    C1  C2  C3  C4  C5
r1 120 240 360 480 600
r2 120 240 360 480 600
r3 120 240 360 480 600

$C
    C1  C2  C3  C4  C5
r1 140 280 420 560 700
r2 140 280 420 560 700
r3 140 280 420 560 700

$D
    C1  C2  C3  C4  C5
r1 160 320 480 640 800
r2 160 320 480 640 800
r3 160 320 480 640 800

However, quite deceptively

> sapply(v, myF, y = 2*(1:5))
        A   B   C   D
 [1,] 100 120 140 160
 [2,] 100 120 140 160
 [3,] 100 120 140 160
 [4,] 200 240 280 320
 [5,] 200 240 280 320
 [6,] 200 240 280 320
 [7,] 300 360 420 480
 [8,] 300 360 420 480
 [9,] 300 360 420 480
[10,] 400 480 560 640
[11,] 400 480 560 640
[12,] 400 480 560 640
[13,] 500 600 700 800
[14,] 500 600 700 800
[15,] 500 600 700 800

My proposal -- implemented and "make check" tested --
is to add an optional argument  'ARRAY'
which allows

> sapply(v, myF, y = 2*(1:5), ARRAY=TRUE)
, , A

    C1  C2  C3  C4  C5
r1 100 200 300 400 500
r2 100 200 300 400 500
r3 100 200 300 400 500

, , B

    C1  C2  C3  C4  C5
r1 120 240 360 480 600
r2 120 240 360 480 600
r3 120 240 360 480 600

, , C

    C1  C2  C3  C4  C5
r1 140 280 420 560 700
r2 140 280 420 560 700
r3 140 280 420 560 700

, , D

    C1  C2  C3  C4  C5
r1 160 320 480 640 800
r2 160 320 480 640 800
r3 160 320 480 640 800

>
-----------

In the best of all worlds, the default would be 'ARRAY = TRUE',
but of course, given the long-standing different behavior,
that seems much too "risky", so my proposal remains
back-compatible, with default ARRAY = FALSE.

Martin Maechler,
ETH Zurich

```
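
For readers who want to reproduce the stacked result without the proposed argument, here is a minimal sketch in plain R. It takes the unsimplified list and builds the rank-3 array by hand. Note that `stack_matrices()` is a hypothetical helper introduced only for this illustration; it is not the implementation behind the proposed `ARRAY=TRUE` and is not part of base R. It assumes all list components are matrices with identical dimensions.

```r
# Definitions repeated from the message so the sketch is self-contained.
p0 <- function(...) paste(..., sep = "")
myF <- function(x, y) {
    stopifnot(length(x) <= 3)
    x <- rep(x, length.out = 3)
    ny <- length(y)
    r <- outer(x, y)
    dimnames(r) <- list(p0("r", 1:3), p0("C", seq_len(ny)))
    r
}
v <- structure(10 * (5:8), names = LETTERS[1:4])

# Hypothetical helper (an assumption for illustration only):
# stack a list of equally sized matrices along a new third dimension.
stack_matrices <- function(lst) {
    d <- dim(lst[[1]])
    # All components must share the same dimensions.
    stopifnot(all(vapply(lst, function(m) identical(dim(m), d), logical(1))))
    dn <- dimnames(lst[[1]])
    if (is.null(dn)) dn <- vector("list", length(d))
    # unlist() concatenates the matrices column by column, which is
    # exactly the storage order array() expects for the last index.
    array(unlist(lst, use.names = FALSE),
          dim = c(d, length(lst)),
          dimnames = c(dn, list(names(lst))))
}

res <- sapply(v, myF, y = 2 * (1:5), simplify = FALSE)
a <- stack_matrices(res)
dim(a)       # the array rank is length(dim(a))
a[, , "B"]   # the slice corresponding to res$B
```

The sketch relies only on `array()`, `unlist()` and `vapply()`; the names of the input vector end up as the dimnames of the third dimension, as in the `, , A` to `, , D` slices shown in the message.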