[R] how to efficiently compute set unique?

G FANG fanggangsw at gmail.com
Tue Jun 22 03:06:58 CEST 2010


I want to get the set of unique values from a large numeric k-by-1 vector,
where k is in the tens of millions.

When I use the MATLAB function unique, it takes less than 10 seconds,

but when I try unique in R on a machine with similar CPU and memory,
it has still not finished after several minutes.

Am I using the function in the right way? The dimensions of cntxtn and
the call I am making are:

[1] 13584763        1
uniqueCntxt <- unique(cntxtn)    # this is taking really long
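
One thing that may matter here: the two-number dimension output above
suggests cntxtn is stored as a one-column matrix rather than a plain
vector. If so, unique() dispatches to the matrix method, which compares
whole rows and is typically much slower than the vector method. Below is
a minimal sketch under that assumption. The data are a made-up, much
smaller stand-in for cntxtn, and the names x, u_mat and u_vec are
hypothetical.

```r
# Hypothetical stand-in for cntxtn: a one-column numeric matrix,
# much smaller than the real data so the matrix method finishes quickly.
set.seed(1)
n <- 1e5
x <- matrix(as.numeric(sample.int(1000, n, replace = TRUE)), ncol = 1)

# unique() on a matrix dispatches to unique.matrix, which works row by row.
t_mat <- system.time(u_mat <- unique(x))

# Dropping the dim attribute gives a plain vector, so unique.default is used.
t_vec <- system.time(u_vec <- unique(as.vector(x)))

# Both calls should return the same values in the same order.
stopifnot(identical(as.vector(u_mat), u_vec))

# Compare elapsed times; actual numbers depend on the machine.
print(rbind(matrix_method = t_mat, vector_method = t_vec))
```

If the assumption holds for the real data, unique(as.vector(cntxtn)) or
unique(cntxtn[, 1]) may be worth trying in place of the call above.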

Please advise.



