[Rd] Lightweight data frame class

Gabor Grothendieck ggrothendieck at myway.com
Fri Nov 26 04:42:10 CET 2004


Vadim Ogranovich <vograno <at> evafunds.com> writes:

: 
: Hi,
: 
: As far as I can tell data.frame class adds two features to those of
: lists:
: * matrix structure via [,] and [,]<- operators (well, I know these are
: actually "["(i, j, ...), not "[,]").
: * row names attribute.
: 
: It seems that the overhead of the support for the row names, both
: computational and RAM-wise, is rather non-trivial. I frequently
: subscript from a data.frame, i.e. use [,] on data frames, and my timing
: shows that the equivalent list operation is about 7 times faster, see
: below.
: 
: On the other hand, at least in my usage pattern, I rarely benefit
: from the row names attribute, so as far as I am concerned row names are
: just overhead. (Of course the speed difference may be due to other
: factors; all I can tell is that subscripting is much slower in
: data frames than in lists.)
: 
: I thought of writing a new class, say lightweight.data.frame, that would
: be polymorphic with the existing data.frame class. The class would
: inherit from "list" and implement [,], [,]<- operators. It would also
: implement the "rownames" function that would return seq(nrow(x)), etc.
: It should also implement as.data.frame to avoid the overhead of
: conversion to a full-blown data.frame in calls like lm(y ~ x,
: data=myLightweightDataframe).
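
A rough sketch of the class described above might look like the following. Everything here is illustrative, not an existing implementation: the constructor name lightweight.data.frame and the method bodies are assumptions, only the two-index form x[i, j] is handled, and the [,]<- operator is left out. Row names are computed on demand from the row count instead of being stored.

```r
# Hypothetical constructor: a plain list of equal-length columns with a class tag.
lightweight.data.frame <- function(...) {
  cols <- list(...)
  n <- unique(vapply(cols, length, integer(1)))
  stopifnot(length(n) <= 1L)  # all columns must have the same length
  structure(cols, class = c("lightweight.data.frame", "list"))
}

# nrow() and ncol() work through dim().
dim.lightweight.data.frame <- function(x) {
  c(if (length(x)) length(x[[1L]]) else 0L, length(x))
}

# Two-index subscripting only: x[i, j], x[i, ] or x[, j].
# Unlike data.frame, the result is never dropped to a vector.
"[.lightweight.data.frame" <- function(x, i, j, ...) {
  cols <- unclass(x)
  if (!missing(j)) cols <- cols[j]                             # select columns
  if (!missing(i)) cols <- lapply(cols, function(col) col[i])  # select rows
  structure(cols, class = class(x))
}

# Row names are never stored; they are generated when asked for.
row.names.lightweight.data.frame <- function(x) {
  as.character(seq_len(nrow(x)))
}

# Conversion to a full data.frame happens only when explicitly requested.
as.data.frame.lightweight.data.frame <- function(x, ...) {
  as.data.frame(unclass(x), ...)
}

ldf <- lightweight.data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 7.8, 10.1))
sub <- ldf[2:3, "y"]
```

Here sub is again a lightweight.data.frame with one column and two rows. Whether this actually recovers the speed of plain list subscripting would have to be measured.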

The next version of zoo (currently in test) supports lists in the
data argument of lm and can also merge zoo series into a list (or
to another zoo series, as it does now). Would that be a sufficient
alternative?
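
As a small illustration of the list-as-data idea, the sketch below passes a plain list to lm and compares the result with the data.frame route. It does not use zoo at all; it only assumes that lm in the stats package accepts a list for data, as current R documents. The variable names and simulated values are made up for the example.

```r
set.seed(1)

# A plain list of equal-length vectors, with no row names attribute.
dat <- list(x = 1:10, y = 2 * (1:10) + rnorm(10))

fit_list <- lm(y ~ x, data = dat)                 # list passed directly
fit_df   <- lm(y ~ x, data = as.data.frame(dat))  # explicit data.frame

# The two fits should give the same coefficients.
same_coef <- isTRUE(all.equal(coef(fit_list), coef(fit_df)))
```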


