[R] uniq -c

Sam Steingold sds at gnu.org
Tue Oct 16 18:29:52 CEST 2012


> * R. Michael Weylandt <zvpunry.jrlynaqg at tznvy.pbz> [2012-10-16 16:19:27 +0100]:
>
> Have you looked at using table() directly? If I understand what you
> want correctly something like:
>
> table(do.call(paste, x))

I want to avoid paste(): I would have to re-split the keys later, so it
would be a performance nightmare.
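
A minimal sketch of counting duplicate rows while keeping the columns
separate, so that nothing has to be re-split afterwards. It uses base R's
aggregate(). The sample data frame and the helper column name n are
made up for illustration, and it has not been timed against
paste() + table() on large inputs, so treat it as a starting point only.

```r
# Hypothetical sample data; the real x would come from the poster's data set.
x <- data.frame(a = c(1, 1, 2, 2, 2),
                b = c("u", "u", "v", "v", "w"))

# Add a constant column n = 1 and sum it within each distinct combination
# of the remaining columns. Assumes x has no column already named n.
counts <- aggregate(n ~ ., data = cbind(x, n = 1L), FUN = sum)

print(counts)
```

The result keeps a and b as ordinary columns, with the row count in n.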

> Also, if you take a look at the development version of R, changes are
> being put in place to allow much larger data sets.
>>
>> xtabs(), although dog slow, would have footed the bill nicely:
>> --8<---------------cut here---------------start------------->8---
>>> x <- data.frame(a=1:32,b=1:32,c=1:32,d=1:32,e=1:32)
>>> system.time(subset(as.data.frame(xtabs( ~. , x )), Freq != 0 ))
>>    user  system elapsed
>>  12.788   4.288  17.224
>> --8<---------------cut here---------------end--------------->8---

You should not need "much larger data sets" for this.
x is sorted, so identical rows are adjacent.
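
Because x is sorted, a uniq -c style count only needs to compare each row
with the one before it. The following is a rough base R sketch of that
idea. The function name uniq_c and the output column name count are
invented here, and the code assumes at least one column and no NA values.

```r
# Count runs of identical adjacent rows, like `uniq -c` on a sorted file.
# Assumes df is already sorted, has at least one column, and has no NAs.
uniq_c <- function(df) {
  n <- nrow(df)
  if (n == 0L) {
    df$count <- integer(0)
    return(df)
  }
  # TRUE at position i when row i + 1 differs from row i in any column
  changed <- Reduce(`|`, lapply(df, function(col) col[-1L] != col[-n]))
  # Index of the first row of each run
  starts <- c(1L, which(changed) + 1L)
  out <- df[starts, , drop = FALSE]
  # Run length = distance from one run start to the next
  out$count <- diff(c(starts, n + 1L))
  rownames(out) <- NULL
  out
}

# Hypothetical sorted sample data
x <- data.frame(a = c(1, 1, 2, 2, 2),
                b = c("u", "u", "v", "v", "w"))
res <- uniq_c(x)
print(res)
```

This never builds the full cross-tabulation that xtabs() does, so its cost
grows with the number of rows rather than with the product of the number
of levels in each column.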

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://openvotingconsortium.org http://iris.org.il
http://www.memritv.org http://memri.org http://think-israel.org
Just because you're paranoid doesn't mean they AREN'T after you.




More information about the R-help mailing list