[R] loop in a data.table

Camilo Mora cmora at dal.ca
Thu Mar 14 00:25:57 CET 2013


Hi everyone,

I have a data.table called "data" with many columns, and I want to
aggregate it by column1 using data.table because of its speed.

The problem with looping over the columns of a data.table is that
data.table does not accept quoted column names (e.g. "col2" instead
of col2). I found a workaround, which is to use get("col2"). It works
fine, but the processing time multiplies by 20.

So if I use:

data[,sum(col2),by=(key)]

entering the column names by hand, the operation is done in 1 sec. But
if, on the contrary, I use:

data[,sum(get("col2")),by=(key)]

using a loop to supply the column names, the same operation takes
20 sec. I cannot use the former code because I have 100000 files to
process, but the latter will simply take months to complete. Is there
any alternative to the function "get", or any other way in which
data.table can recognize the names of the columns?
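
To make the setup concrete, here is a minimal sketch of the loop I mean. The data are made up: the column names column1, col2 and col3, the values, and the names "cols" and "results" are invented for illustration only, and my real tables are far larger.

```r
library(data.table)

# Made-up example data: one grouping column and two numeric columns
# (names and values are hypothetical, for illustration only)
data <- data.table(
  column1 = rep(c("a", "b"), each = 3),
  col2    = 1:6,
  col3    = c(10, 20, 30, 40, 50, 60)
)

# Column names held as character strings, as they would be inside a loop
cols <- c("col2", "col3")

# Loop version: look each column up by its quoted name with get()
results <- list()
for (cn in cols) {
  results[[cn]] <- data[, sum(get(cn)), by = column1]
}

# Hand-typed version of the first iteration, for comparison
by_hand <- data[, sum(col2), by = column1]
```

Both forms return the same sums per group. The only difference I see is the running time on the real data.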

Thanks,
Camilo




Camilo Mora, Ph.D.
Department of Geography, University of Hawaii
Currently available in Colombia
Phone:   Country code: 57
          Provider code: 313
          Phone 776 2282
          From the USA or Canada you have to dial 011 57 313 776 2282
http://www.soc.hawaii.edu/mora/


