[Rd] data frame subscription operator

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Nov 8 09:21:05 CET 2006


'[' is the 'subscript' or 'extraction' operator, not the 'subscription' 
operator; its use is also called 'indexing', as in 'An Introduction to R'.

On Mon, 6 Nov 2006, Vladimir Dergachev wrote:

>   I was looking at the data frame subscription operator (attached at the end
> of this e-mail) and got puzzled by the following line:
>
>    class(x) <- attr(x, "row.names") <- NULL
>
> This appears to set the class and row.names attributes of the incoming data
> frame to NULL.

Actually no, it removes them: see ?attr and ?class.
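
As a minimal sketch of that distinction (the data frame 'x' and its columns
are made up for illustration), assigning NULL through the replacement
functions leaves no such attribute behind, rather than an attribute whose
value is NULL:

```r
# Illustrative data only; 'x' and its columns are assumptions.
x <- data.frame(a = 1:3, b = letters[1:3])

# The chained form from the quoted line: the inner assignment removes
# "row.names" and yields NULL, which is then assigned to the class.
class(x) <- attr(x, "row.names") <- NULL

# Only "names" is left, so x is now a plain named list.
remaining <- names(attributes(x))
print(remaining)
print(is.data.frame(x))
```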

> So far I was not able to figure out why this is necessary -
> could anyone help ?

You need to remove the class to avoid recursion: a few lines later x[i]
needs to be a call to the primitive and not the data frame method.
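
The same pattern can be sketched with a toy S3 class. The class name
"myframe" and its method are hypothetical and are not the actual
[.data.frame code; the sketch only shows why the class is dropped before
x[i] is evaluated:

```r
# Hypothetical S3 class "myframe"; not part of R.
"[.myframe" <- function(x, i) {
    cl <- oldClass(x)
    # Without this line, x[i] below would dispatch to [.myframe again
    # and recurse without end.
    class(x) <- NULL
    y <- x[i]          # now a call to the primitive '[' on a plain list
    class(y) <- cl     # restore the class on the result
    y
}

obj <- structure(list(a = 1:3, b = 4:6, c = 7:9), class = "myframe")
sub <- obj["b"]
print(names(sub))
```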

> The reason I am looking at it is that changing attributes forces duplication
> of the data frame and this is the largest cause of slowness of data.frames in
> general.

Do you have evidence of that?  R has facilities to profile its code, and I 
have never seen [.data.frame taking a significant proportion of the total 
time.  If it does for your application, consider whether a data frame is an 
appropriate way to store your data.  I am not sure we would accept that
data frames do have 'slowness in general', but their generality does make 
them slower than alternatives where the generality is not needed.
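
One way to gather such evidence is a sketch like the following, which uses
Rprof() and summaryRprof() from the utils package. The data frame, its size
and the one-second loop are arbitrary assumptions, and Rprof() is only
available if R was built with profiling support:

```r
# Illustrative data; sizes and column names are assumptions.
df <- data.frame(a = runif(1e5), b = runif(1e5))

prof_file <- tempfile()
Rprof(prof_file)                      # start sampling the call stack

# Index rows repeatedly for about one second so that the profiler
# has time to collect samples.
t0 <- proc.time()[["elapsed"]]
while (proc.time()[["elapsed"]] - t0 < 1) {
    for (k in 1:100) tmp <- df[k, ]
}

Rprof(NULL)                           # stop profiling

# Time attributed to each function itself, largest first.
prof <- summaryRprof(prof_file)
print(head(prof$by.self))
```

If [.data.frame and the functions it calls account for only a small share
of the time in a profile of the real application, the indexing code is
unlikely to be the bottleneck.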

[...]

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


