[R] sapply returning list instead of matrix

chris warth cswarth at gmail.com
Mon Feb 3 07:54:39 CET 2014


Can I follow up with what I've learned about my own myopia regarding
sapply()?

First, I appreciate all the feedback. After thinking about it for a
while, I realized the R designers have often chosen to accommodate
interactive usage, and in that context, sapply() returning different
types makes perfect sense.

If applying both 'mean' and 'var' to multiple data sets in a list, it
makes sense to return a matrix, but if applying just 'mean' to the same
list of data sets it makes sense to return a plain vector, not a 1xN
matrix. This works well in an interactive context, but when writing
robust applications it is essential that routines return consistent
types, especially if the parameters are determined from unpredictable
user input. The behavior of functions like sapply() in R seems
extraordinary compared to languages I am more familiar with like C,
Java, or Python.
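
A minimal sketch of the two cases, using made-up data (the list
`datasets` and its values are illustrative only, not taken from my
application):

```r
# Three made-up data sets; names and values are illustrative only.
datasets <- list(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))

# Two summary values per data set: sapply() simplifies to a 2 x 3 matrix.
both <- sapply(datasets, function(x) c(mean = mean(x), var = var(x)))

# One summary value per data set: sapply() simplifies to a named vector
# with no dim attribute.
just_mean <- sapply(datasets, mean)

str(both)
str(just_mean)
```

The same call shape gives a matrix in one case and a dimensionless
vector in the other, so code that indexes the result as `res[i, j]`
only works for the first.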

In my case I was using sapply() to extract alignments from multiple
BAM files that overlap exons of a gene.    My application of sapply()
returned a matrix with data sets across columns and exons down the
rows.   This worked well for most genes, but failed when run on a gene
with only a single exon because sapply() returned a list instead of a
matrix.   This bug in my code was just waiting for the right set of
inputs to trigger it.
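
The failure mode can be sketched roughly as follows. The function
`count_per_exon()` is a hypothetical stand-in for the real BAM
extraction, and `sample_ids` is made up. Because the stand-in returns
plain numeric vectors, the single-exon result here is a vector rather
than a list; the exact type depends on what the applied function
returns. Binding the unsimplified lapply() results column-wise is one
possible way to keep the shape fixed:

```r
# Hypothetical stand-in for the real per-exon computation: returns one
# number per exon for a given sample.
count_per_exon <- function(sample_id, n_exons) sample_id * seq_len(n_exons)

sample_ids <- c(s1 = 1, s2 = 2)  # made-up sample identifiers

multi  <- sapply(sample_ids, count_per_exon, n_exons = 3)  # 3 x 2 matrix
single <- sapply(sample_ids, count_per_exon, n_exons = 1)  # no dimensions

is.matrix(multi)
is.matrix(single)

# Binding the lapply() results column-wise keeps the layout fixed:
# exons down the rows, samples across the columns, even for one exon.
bind_columns <- function(ids, n_exons) {
  do.call(cbind, lapply(ids, count_per_exon, n_exons = n_exons))
}

dim(bind_columns(sample_ids, 3))
dim(bind_columns(sample_ids, 1))
```

Note that this sketch assumes at least one sample; with an empty input,
do.call(cbind, list()) returns NULL rather than a matrix.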

[ Some suggested using vapply(), but I don't think that would help in
this case because the length of the return value from the applied
function is variable and depends on how many exons are in the gene.
Or perhaps I just don't understand vapply() well. ]
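
For what it is worth, FUN.VALUE in vapply() does not have to be a
constant; it can be built at run time once the exon count is known.
vapply() then checks every result against that template, but it still
returns a plain vector when the template has length 1, so an explicit
matrix() call is needed to pin the shape. A sketch, where `per_exon()`
and `as_exon_matrix()` are hypothetical names and the data are made up:

```r
# Hypothetical stand-in: one value per exon for a given sample.
per_exon <- function(sample_id, n_exons) sample_id * seq_len(n_exons)

samples <- c(a = 1, b = 2)  # made-up sample identifiers

# Build the vapply() template from the run-time exon count, then force
# an n_exons x n_samples matrix regardless of what vapply() returned.
as_exon_matrix <- function(samples, n_exons) {
  res <- vapply(samples, per_exon, FUN.VALUE = numeric(n_exons),
                n_exons = n_exons)
  matrix(res, nrow = n_exons, ncol = length(samples),
         dimnames = list(NULL, names(samples)))
}

m3 <- as_exon_matrix(samples, 3)
m1 <- as_exon_matrix(samples, 1)
dim(m3)
dim(m1)
```

The benefit over sapply() is that a result of the wrong length or type
raises an error immediately instead of silently changing the shape of
the output.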

sapply() is behaving very similarly to the way the '[' operator
treats data frames. The extract operator '[' returns a vector when
extracting a single column from a data frame with df[, j]; otherwise
it returns a data frame. However, '[' takes a 'drop' argument to
control this behavior, so you can get a consistent type back if you
need it. ('[[' takes no 'drop' argument.)
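
A short illustration of 'drop' with made-up data, for both a data frame
and a matrix:

```r
df <- data.frame(x = 1:3, y = 4:6)  # illustrative data

col_vec <- df[, "x"]                # integer vector
col_df  <- df[, "x", drop = FALSE]  # still a data frame, 3 rows x 1 column

m <- matrix(1:6, nrow = 2)
row_vec <- m[1, ]                   # dimensions dropped
row_mat <- m[1, , drop = FALSE]     # 1 x 3 matrix

str(col_vec)
str(col_df)
dim(row_mat)
```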

I wish sapply() had a similar option.

-csw


