[R] How x[, 'colname1'] is implemented?

William Dunlap wdunlap at tibco.com
Fri Jan 1 00:07:36 CET 2010



Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Peng Yu
> Sent: Thursday, December 31, 2009 2:16 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] How x[, 'colname1'] is implemented?
> 
> I don't see where the documentation describes the implementation of '[]'.

You'd probably have to look in the source code for implementation
details like that.

> For example, if x is a matrix or a data.frame, how is the lookup of
> 'colname1' in x[, 'colname1'] executed? Does R perform a lookup in
> a hash of the colnames? Is the reference O(1) or O(n), where n is the
> second dim of x?

You can easily run timing tests in R by using system.time().
The sum of the first 2 components of its output gives the
CPU time.  E.g.,

  > f<-function(ncol){
       d<-data.frame(as.list(1:ncol))
       names(d)<-paste("Col",1:ncol)
       sum(system.time(for(i in 1:100)d['Col 1'])[1:2])
    }
  > z <- sapply(n<-2^(0:20), f)
  > z
   [1]  0.06  0.01  0.01  0.02  0.00  0.03
   [7]  0.02  0.01  0.01  0.03  0.02  0.02
  [13]  0.02  0.10  0.16  0.33  0.63  1.49
  [19]  3.22  8.35 18.91
  > plot(n, z, log="xy") # neither O(1) nor O(ncol)

Compare the results to subscripting by number, and see how fast
the column-name-to-column-number algorithm is with various naming
schemes.
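
One way to set up that comparison is sketched below. It is only an
illustration, not a statement about how '[' works internally: the
helper name time_lookup and its reps argument are made up for this
example, and the column counts stop at 2^12 just to keep the run short.

```r
# Hypothetical helper (not from the original post): time `reps`
# single-column extractions from a data frame with `ncol` columns,
# once by name and once by position.
time_lookup <- function(ncol, reps = 100) {
  d <- data.frame(as.list(seq_len(ncol)))
  names(d) <- paste("Col", seq_len(ncol))
  # user + system CPU time, i.e. the first 2 components of system.time()
  cpu <- function(expr) sum(system.time(expr)[1:2])
  c(by_name   = cpu(for (i in seq_len(reps)) d["Col 1"]),
    by_number = cpu(for (i in seq_len(reps)) d[1]))
}

n <- 2^(0:12)
z <- sapply(n, time_lookup)  # 2-row matrix, one column per value of n
colnames(z) <- n
print(t(z))                  # one row per column count
```

If the by_number timings stay flat while the by_name timings grow with
the number of columns, the extra cost is in the name-to-number step.
Changing the paste() call gives other naming schemes to try.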
