[Rd] Five functions proposed for base

Arni Magnusson arnima at u.washington.edu
Thu Aug 21 10:13:16 MEST 2003


Good points, Martin. Thanks for looking at this.

I have to admit I had only briefly looked at str() and ls.str() until I
read your message. I guess they seemed to me more like a tool for
developers than for end users, but now I realize str() is a valid
contender with dim(), summary.data.frame(), describe(), and now elem() to
view data frames 'from the outside'.

Granted, different users are looking for different information. I often
use dim(), summary(), and describe(), so I implemented elem() in a way
that wouldn't overlap with those. As a rough measure of compactness and
ease of reading, the output sizes for my coho data frame (1768x10) are

                  non-whitespace characters   lines
dim(coho):                                6       1
elem(coho):                             184      12
str(coho):                              543      11
summary(coho):                          688       8
describe(coho):                        2242      70
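
Counts like these can be gathered with capture.output(). The following is only a sketch: compactness() is a hypothetical helper, the built-in iris data frame (150x5) stands in for coho, and the counting rule is an assumption. In particular, it counts the "[1]" index prefix, which the dim(coho) figure above apparently does not.

```r
# Hypothetical helper: count non-whitespace characters and lines
# in a character vector of printed output (one element per line).
compactness <- function(lines) {
  stripped <- gsub("[[:space:]]", "", lines)
  c(chars = sum(nchar(stripped)), lines = length(lines))
}

# iris is a stand-in for the coho data frame, which is not available here.
views <- list(
  dim     = capture.output(dim(iris)),
  str     = capture.output(str(iris)),
  summary = capture.output(summary(iris))
)

# One row per function, with columns 'chars' and 'lines'.
result <- t(sapply(views, compactness))
print(result)
```

The absolute numbers depend on print settings such as options("width"), so they are best read as relative comparisons between functions.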

The main reason I include element size (sorry about the misspelling of kB)
is simply that it's not provided recursively by any other function, yet
the user might be interested. It makes it easier to evaluate how much the
data containers could be shrunk by removing certain elements, coercing to
other storage modes, or using a matrix. This is mainly relevant for very
large datasets, and perhaps for beginners who are familiarizing themselves
with different storage modes.
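
As an illustration of per-element sizes, here is a minimal sketch using object.size() on each column. It is not the elem() implementation; the data frame df and its columns are made up for the example.

```r
# Hypothetical stand-in data frame; coho itself is not available here.
df <- data.frame(
  id     = 1:1000,
  weight = as.numeric(1:1000),
  sex    = rep(c("F", "M"), 500),
  stringsAsFactors = FALSE
)

# Approximate size of each column in bytes, as reported by object.size().
col_bytes <- sapply(df, function(x) as.numeric(object.size(x)))
print(col_bytes)

# Effect of coercing a whole-number double column to integer storage.
before <- as.numeric(object.size(df$weight))
after  <- as.numeric(object.size(as.integer(df$weight)))
print(c(before = before, after = after))
```

Doubles take 8 bytes per value and integers 4, so the coerced column should come out at roughly half the size, plus a small fixed header.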

The object size is more important in the ll() output. When I tidy my
workspace, the heavy objects are often the first to go, but sometimes vice
versa (using the keep() function). It also helps me spot major data frames
and models in a sea of objects. One way to avoid the unwanted function
name ll() would be to take the Unix analogy all the way and add an
argument to ls(). Those who want it could then define
ll <- function(...) ls(..., long=TRUE), or something along those lines.
Compactness stats are

                        non-whitespace characters   lines
ls():                                          74       1
ll():                                         156      12
ll(dim=T):                                    197      12
ls.str(max.level=-1):                         353      12
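
To make the idea of a long listing concrete, here is a minimal sketch. The long=TRUE argument to ls() mentioned above is only a proposal, so the sketch uses a separate, hypothetical function ll_sketch() that reports the class and approximate size of each object in an environment. It is not the actual ll() implementation.

```r
# Hypothetical long listing: class and approximate size (kB) per object.
ll_sketch <- function(envir = globalenv()) {
  nms <- ls(envir = envir)
  data.frame(
    Class = vapply(nms, function(n) class(get(n, envir = envir))[1],
                   character(1)),
    KB    = vapply(nms, function(n)
                     as.numeric(object.size(get(n, envir = envir))) / 1024,
                   numeric(1)),
    row.names = nms
  )
}

# Usage on a small scratch environment, heaviest objects first.
e <- new.env()
assign("x", 1:10, envir = e)
assign("m", matrix(0, 100, 100), envir = e)
out <- ll_sketch(e)
print(out[order(-out$KB), ])
```

Sorting by size puts the heavy objects at the top, which is the view that helps when tidying a workspace.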

Discussing the core info-utilities in R is worth the time. Perhaps others
will comment on what information they look for, and how they go about
getting it. My only feedback so far is from colleagues who are using my
functions.

Cheers,
Arni



On Thu, 21 Aug 2003, Martin Maechler wrote:

> Thank you, Arni.
>
> Note that "base" already has
>   str()  which seems more useful than elem()
> 	 -- at least for human inspection, str() does not return a value;
>
>   ls.str() building on str() which is somewhat
>     related to your ll() {the name of which would be too short for R base}.
>
>   As a matter of fact, for a really compact ls.str() output, I'd
>   have to change the default value of 'max.level = 0' to '-1' and
>   change  'max.level = 0' to mean no recursion into the list
>   structure at all.
>
> Why is it important for you to know sizes in kB or MB, instead
> of just length/dim ?   Note that object.size() basically
> recursively builds on length() {and typeof()}, but still is
> only approximative.
>
> Regards,
> Martin
>
> Martin Maechler <maechler at stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
> Seminar fuer Statistik, ETH-Zentrum  LEO C16	Leonhardstr. 27
> ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
> phone: x-41-1-632-3408		fax: ...-1228			<><
>


