[Rd] data frame subscription operator
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Nov 8 09:21:05 CET 2006
'[' is the 'subscript' or 'extraction' operator, not the 'subscription'
operator; its use is also called 'indexing', as in 'An Introduction to R'.
On Mon, 6 Nov 2006, Vladimir Dergachev wrote:
> I was looking at the data frame subscription operator (attached at the end
> of this e-mail) and got puzzled by the following line:
> class(x) <- attr(x, "row.names") <- NULL
> This appears to set the class and row.names attributes of the incoming data
> frame to NULL.
Actually no, it removes them: see ?attr and ?class.
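A minimal sketch of the difference, assuming a small throwaway data frame
(the name `df` is illustrative and not taken from the original code):
assigning NULL through `attr<-` or `class<-` deletes the attribute rather
than storing a NULL value in it.

```r
# A small data frame to inspect (illustrative only).
df <- data.frame(a = 1:3, b = c("x", "y", "z"))

x <- df
# The right-hand side is evaluated first: the row.names attribute is
# removed, the inner assignment yields NULL, and that NULL is then
# assigned to the class, which removes the class attribute as well.
class(x) <- attr(x, "row.names") <- NULL

names(attributes(x))       # only the "names" attribute is left
is.null(attr(x, "class"))  # the attribute is absent, not set to NULL
is.list(x)                 # what remains is a plain named list
```

After the assignment x is an ordinary list of columns, so later operations
on it no longer dispatch to data frame methods.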
> So far I was not able to figure out why this is necessary -
> could anyone help ?
You need to remove the class to avoid recursion: a few lines later x[i]
needs to be a call to the primitive and not the data frame method.
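The recursion point can be sketched with a toy S3 class. The class name
"myframe" and its method are hypothetical and much simpler than the real
[.data.frame; the sketch only shows why the class has to be dropped before
indexing inside the method.

```r
# Hypothetical S3 class "myframe"; not part of base R.
"[.myframe" <- function(x, i) {
  cls <- oldClass(x)   # remember the class so it can be restored
  class(x) <- NULL     # without this, x[i] below would call this method again
  y <- x[i]            # now handled by the internal list subsetting code
  class(y) <- cls      # put the class back on the result
  y
}

obj <- structure(list(a = 1:3, b = 4:6, c = 7:9), class = "myframe")
res <- obj[c("a", "c")]
names(res)
class(res)
```

Using unclass(x)[i] or NextMethod() inside the method are other common ways
to reach the default behaviour.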
> The reason I am looking at it is that changing attributes forces duplication
> of the data frame and this is the largest cause of slowness of data.frames in
Do you have evidence of that? R has facilities to profile its code, and I
have never seen [.data.frame taking a significant proportion of the total
time. If it does for your application, consider whether a data frame is an
appropriate way to store your data. I am not sure we would accept that
data frames do have 'slowness in general', but their generality does make
them slower than alternatives where the generality is not needed.
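One way to gather such evidence is R's sampling profiler. The following is
only a sketch: the data frame, the loop, and the iteration count are made up
for illustration, and it assumes an R build with profiling support (the
default).

```r
# Illustrative data and workload; replace with the real application code.
df <- data.frame(a = runif(1e4), b = runif(1e4))

prof_file <- tempfile()
Rprof(prof_file)                    # start the sampling profiler
for (k in 1:20000) tmp <- df[k, ]   # repeated row extraction
Rprof(NULL)                         # stop profiling

# Time attributed to each function, including the functions it calls.
# The workload must run long enough for the profiler to take samples.
head(summaryRprof(prof_file)$by.total)
```

If [.data.frame does not account for a large share of the total time in the
real application, the bottleneck is elsewhere.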
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595