[R] lookup in R - possible to avoid loops?

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Mon Nov 8 21:59:01 CET 2010


Thanks a lot - extremely helpful!
While I'll definitely try to use merge in the future, in my situation
I run into problems with memory (files are too large).
However, Phil's suggestion is perfect for me - sped me up considerably!
Thank you, again!
Dimitri
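
P.S. For anyone finding this thread later, here is a minimal, self-contained sketch of the named-vector lookup Phil describes in the message quoted below. The toy values and the use of setNames() and unname() are my own choices for illustration, not part of Phil's code.

```r
# Toy data mirroring the quoted example (the values themselves are illustrative).
my.df <- data.frame(names = rep(letters[1:3], each = 3),
                    value = 1:9,
                    stringsAsFactors = FALSE)
my.lookup <- data.frame(names = letters[1:3],
                        category = c("AAA", "BBB", "CCC"),
                        stringsAsFactors = FALSE)

# Build the lookup table: the values are categories, the element names are the keys.
categories <- setNames(my.lookup$category, my.lookup$names)

# Index by character key; unname() drops the element names carried over from the lookup.
my.df$category <- unname(categories[my.df$names])

# A key that is absent from the lookup yields NA instead of an error.
missing.demo <- unname(categories[c("a", "z")])
```

One caveat: the key column should be character. If my.df$names were a factor, the indexing would use the factor's integer codes rather than its labels, so wrapping it in as.character() is the safer habit.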

On Mon, Nov 8, 2010 at 2:51 PM, Phil Spector <spector at stat.berkeley.edu> wrote:
> Dimitri -
>   While merge is most likely the fastest way to solve
> your problem, I just want to point out that you can use
> a named vector as a lookup table.  For your example:
>
> categories = my.lookup$category
> names(categories) = my.lookup$names
>
> creates the lookup table, and
>
> my.df$category = categories[my.df$names]
>
> creates the category column.
>                                           - Phil
>
>
>
> On Mon, 8 Nov 2010, Dimitri Liakhovitski wrote:
>
>> Hello!
>> Hope there is a nifty way to speed up my code by avoiding loops.
>> My task is simple - analogous to the vlookup formula in Excel. Here is
>> how I programmed it:
>>
>> # My example data frame:
>> set.seed(1245)
>>
>> my.df<-data.frame(names=rep(letters[1:3],3),value=round(rnorm(9,mean=20,sd=5),0))
>> my.df<-my.df[order(my.df$names),]
>> my.df$names<-as.character(my.df$names)
>> (my.df)
>>
>> # My example lookup table:
>> my.lookup<-data.frame(names=letters[1:3],category=c("AAA","BBB","CCC"))
>> my.lookup$names<-as.character(my.lookup$names)
>> my.lookup$category<-as.character(my.lookup$category)
>> (my.lookup)
>>
>> # Just adding an extra column to my.df that contains the categories of
>> # the names in the column "names":
>> my.df2<-my.df
>> my.df2$category<-NA
>> for(i in unique(my.df$names)){
>>        my.df2$category[my.df2$names %in% i]<-my.lookup$category[my.lookup$names %in% i]
>> }
>> (my.df2)
>>
>> It does what I need, but it's way too slow - I need to run it for
>> hundreds and hundreds of names in >100 of huge files (tens of
>> thousands of rows in each).
>> Any way to speed it up?
>>
>>
>> Thanks a lot!
>>
>> --
>> Dimitri Liakhovitski
>> Ninah Consulting
>> www.ninah.com
>>
>
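
For comparison, a rough sketch of the merge approach mentioned above next to a match()-based lookup, which is another loop-free option in base R. The small data frames here are made up for illustration, and I have not benchmarked either version on large files.

```r
# Made-up data with names deliberately out of order.
my.df <- data.frame(names = c("b", "a", "c", "a"),
                    value = c(10, 20, 30, 40),
                    stringsAsFactors = FALSE)
my.lookup <- data.frame(names = c("a", "b", "c"),
                        category = c("AAA", "BBB", "CCC"),
                        stringsAsFactors = FALSE)

# merge(): a left join on "names". With the default sort = TRUE the result is
# ordered by the key, so the original row order of my.df is not kept.
merged <- merge(my.df, my.lookup, by = "names", all.x = TRUE)

# match(): find the position of each my.df$names in my.lookup$names and use it
# to pull the category directly. The row order of my.df is preserved.
my.df$category <- my.lookup$category[match(my.df$names, my.lookup$names)]
```

The match() version adds a single column in place instead of building a new joined data frame, which may matter when the files are large.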



-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com


