[R] Frequency count of Boolean pattern in 4 vectors.
wdunlap at tibco.com
Sun Jun 2 06:37:09 CEST 2013
For 10 million data points
    table(interaction(vec_D, vec_C, vec_B, vec_A))
took my laptop 11.45 seconds and the following function required 0.18 seconds:
f0 <- function (vec_A, vec_B, vec_C, vec_D)
{
    # Map each DCBA bit pattern to a bin index in 1..16 (A is the low bit)
    x <- 1 + vec_A + 2 * (vec_B + 2 * (vec_C + 2 * vec_D))
    tab <- tabulate(x, nbins = 16)
    names(tab) <- do.call(paste0, rev(expand.grid(0:1, 0:1, 0:1, 0:1)))
    tab
}
Aside from the order of the entries in the output tables, they gave the same results.
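A small check along these lines can confirm that claim. It is only a sketch: the sample size, the seed, and the names fast and slow are arbitrary choices for illustration, and f0 is repeated here as I read it from the message (with the braces and the fourth 0:1 assumed) so that the snippet runs on its own.

```r
# Hypothetical test data; n and the seed are arbitrary.
set.seed(1)
n <- 1000
vec_A <- sample(0:1, n, replace = TRUE)
vec_B <- sample(0:1, n, replace = TRUE)
vec_C <- sample(0:1, n, replace = TRUE)
vec_D <- sample(0:1, n, replace = TRUE)

# The tabulate()-based function, repeated so the example is self-contained.
f0 <- function(vec_A, vec_B, vec_C, vec_D) {
    x <- 1 + vec_A + 2 * (vec_B + 2 * (vec_C + 2 * vec_D))
    tab <- tabulate(x, nbins = 16)
    names(tab) <- do.call(paste0, rev(expand.grid(0:1, 0:1, 0:1, 0:1)))
    tab
}

fast <- f0(vec_A, vec_B, vec_C, vec_D)

# The table(interaction()) version labels its entries like "0.1.0.1" and
# orders them differently, so strip the dots and look up by name.
slow <- table(interaction(vec_D, vec_C, vec_B, vec_A))
names(slow) <- gsub(".", "", names(slow), fixed = TRUE)

# Same counts once the entries are put in the same order.
stopifnot(all(fast == as.vector(slow[names(fast)])))
```

The lookup by name is what absorbs the difference in ordering: interaction() varies its first argument fastest, while f0 counts upward in DCBA order from "0000" to "1111".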
Spotfire, TIBCO Software
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Sridhar Iyer
> Sent: Saturday, June 01, 2013 2:57 PM
> To: r-help at r-project.org
> Subject: [R] Frequency count of Boolean pattern in 4 vectors.
> I need to do this on very large datasets (more than a few million data points). So
> seeking help in figuring out an implementation of the task.
> Input: 4 vectors which contain values 0 or 1 (as integers, not boolean).
> vec_A = ( 0, 1, 0, 0, ...... 1, 0, 1, 0) etc
> vec_B = (0,0,1,1.....)
> vec_C, vec_D (similar to above)
> All four vectors are same length.
> I need to compute the frequency count of the Boolean patterns for DCBA.
> a) Is there a mechanism for combining the 4 vectors (in integer formats)
> into 4 bits of a new vector or some other
> type? (or treat them as boolean values true/false instead of 0 or 1)
> b) What is the most efficient mechanism for obtaining the frequency count of
> each of the sixteen Boolean patterns?
> I need to do this frequently on large datasets. So am trying to get an
> efficient implementation (instead of
> a quick and dirty scheme). Thank you very very much in advance.