[R] efficient conversion of matrix column rows to list elements

Charles C. Berry cberry at tajo.ucsd.edu
Wed Nov 17 21:10:42 CET 2010


On Wed, 17 Nov 2010, Chris Carleton wrote:

> Hi List,
>
> I'm hoping to get opinions for enhancing the efficiency of the following
> code designed to take a vector of probabilities (outcomes) and calculate a
> union of the probability space. As part of the union calculation, combn()
> must be used, which returns a matrix, and the parallelized version of
> lapply() provided in the multicore package requires a list. I've found that
> parallelization is very necessary for vectors of outcomes greater in length
> than about 10 or 15 elements, which is why I need to make use of multicore
> (and, therefore, convert the combn() matrix to a list). It would speed the
> process up if there was a more direct way to convert the columns of combn()
> to elements of a single list.


I think you are mistaken about where the time goes.

Is this what Rprof() tells you?

On my system, combn() is the culprit, not the conversion to a list:

> Rprof()
> outcomes <- 1:25
> nada <- replicate(200, {apply(combn(outcomes,2),2,column2list);NULL})
> Rprof(NULL)
> summaryRprof()
$by.self
           self.time self.pct total.time total.pct
"combn"        0.64    61.54       0.70     67.31
"apply"        0.20    19.23       1.04    100.00
"FUN"          0.10     9.62       1.04    100.00
"!="           0.04     3.85       0.04      3.85
"<"            0.02     1.92       0.02      1.92
"-"            0.02     1.92       0.02      1.92
"is.null"      0.02     1.92       0.02      1.92


And it hardly takes any time at that!
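
To see the split more directly, one could time combn() on its own against the combined call. A rough sketch only: t_combn and t_both are names made up for illustration, and the numbers will differ from machine to machine.

```r
outcomes <- 1:25
column2list <- function(x) list(x)   # same helper as in the quoted code

# Time combn() alone ...
t_combn <- system.time(
  for (k in 1:200) combn(outcomes, 2)
)

# ... and combn() plus the conversion via apply()
t_both <- system.time(
  for (k in 1:200) apply(combn(outcomes, 2), 2, column2list)
)

# The difference is roughly what the conversion itself costs
c(combn_only       = t_combn[["elapsed"]],
  combn_plus_apply = t_both[["elapsed"]])
```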


HTH,

Chuck

p.s. Isn't

 	as.data.frame( combn( outcomes, 2 ) )
or
 	combn(outcomes, 2, list )

good enough?
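
For what it's worth, here is a small sketch of the data frame route next to two other options. The simplify = FALSE and FUN = prod calls are my additions rather than anything from the quoted code, though both are documented arguments of combn(). The vector of probabilities is made up for the example.

```r
outcomes <- c(0.1, 0.2, 0.3, 0.4)   # made-up probabilities
m <- combn(outcomes, 2)             # 2 x 6 matrix, one pair per column

# A data frame is a list of columns, so lapply()/mclapply() accept it as is
via_df <- as.data.frame(m)

# simplify = FALSE makes combn() return a plain list directly
via_list <- combn(outcomes, 2, simplify = FALSE)

# Either way, each list element is one column of m
prods_df   <- unname(sapply(via_df, prod))
prods_list <- sapply(via_list, prod)

# If only the products are needed, FUN can do the work inside combn();
# as.vector() drops any dim attribute on the result
prods_fun <- as.vector(combn(outcomes, 2, FUN = prod))

all.equal(prods_df, apply(m, 2, prod))
all.equal(prods_list, prods_fun)
```

With FUN = prod there is no matrix to convert at all, although that also leaves nothing for mclapply() to parallelize.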


> Any constructive suggestions will be greatly
> appreciated. Thanks for your consideration,
>
> C
>
> code:
> ------------
> unionIndependant <- function(outcomes) {
>    intsctn <- c()
>    column2list <- function(x){list(x)}
>    pb <- ProgressBar(max=length(outcomes),stepLength=1,newlineWhenDone=TRUE)
>    for (i in 2:length(outcomes)){
>        increase(pb)
>        outcomes_ <- apply(combn(outcomes,i),2,column2list)
>        for (j in 1:length(outcomes_)){outcomes_[[j]] <- outcomes_[[j]][[1]]}
>        outcomes_container <- mclapply(outcomes_,prod,mc.cores=3)
>        intsctn[i] <- sum(unlist(outcomes_container))
>    }
>    intsctn <- intsctn[-1]
>    return(sum(outcomes)
>           - sum(intsctn[which(which((intsctn %in% intsctn)) %% 2 == 1)])
>           + sum(intsctn[which(which((intsctn %in% intsctn)) %% 2 == 0)])
>           + ((-1)^length(intsctn) * prod(outcomes)))
> }
> ------------
> PS This code has been tested on vectors of up to length(outcomes) == 25 and
> it should be noted that ProgressBar() requires the R.utils package.
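
As an aside, for independent events the union has a closed form, 1 - prod(1 - p), which needs no combn() at all. Here is a sketch under that independence assumption, with the inclusion-exclusion sum alongside as a check. union_closed and union_incl_excl are hypothetical names, and the probabilities are made up.

```r
# Closed form: P(at least one) = 1 - P(none), assuming independence
union_closed <- function(p) 1 - prod(1 - p)

# Inclusion-exclusion: alternating sum over subsets of size k = 1..n
# (assumes length(p) >= 2, as in the quoted code)
union_incl_excl <- function(p) {
  n <- length(p)
  terms <- vapply(seq_len(n), function(k) {
    (-1)^(k + 1) * sum(combn(p, k, FUN = prod))
  }, numeric(1))
  sum(terms)
}

p <- c(0.1, 0.2, 0.3, 0.4)   # made-up probabilities
all.equal(union_closed(p), union_incl_excl(p))
```

The closed form is linear in length(p), whereas the number of subsets visited by inclusion-exclusion grows as 2^n.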

Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at tajo.ucsd.edu			    UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901


