[R] Looking for Speed in a Toy Simulation Example

Rui Barradas ruipbarradas at sapo.pt
Fri Jun 15 21:22:26 CEST 2012


Hello,

With Michael's sugestions (I keep forgeting package compiler)


t0 <- system.time({ ... variant 3.b ...

#################################################
## Variant 3.c                                 ##
#################################################

install.packages("compiler")
library(compiler)

basedeck <- rep(10^(1:4), 13)
pow10x5 <- 5*10^(1:4)
currentdeck <- matrix(nrow = 5, ncol=noplayer)

singlecolor1 <- matrix(NA, simlength, noplayer)
singlecolor2 <- matrix(NA, simlength, noplayer)

## sum by .colSums
f1 <- function(){
  sapply(1:simlength, function(i){
    currentdeck[] <- sample(basedeck, decklength)
    .colSums(currentdeck, 5, noplayer) %in% pow10x5
  })
}

f2 <- cmpfun(f1)  # cmpfun: compile function
is.function(f2)   # TRUE

set.seed(7777)
t1 <- system.time( singlecolor1[] <- f1() )

set.seed(7777)
t2 <- system.time( singlecolor2[] <- f2() )

identical(singlecolor1, singlecolor2)
rbind(v3.b=t0, v3.c.1=t1, v3.c.2=t2, factor=t0/t2)


Not another 3 or 4x but faster.

Rui Barradas

Em 15-06-2012 15:50, R. Michael Weylandt escreveu:
> On Fri, Jun 15, 2012 at 9:48 AM, R. Michael Weylandt
> <michael.weylandt at gmail.com> wrote:
>> As of recent versions of R, you can actually go for what are
>> officially recognized as "ultimate speed" functions .rowSums() and
>> friends.
>
> Sorry, perhaps that wasn't totally clear. Regarding .rowSums() note
> that leading period. You pass this a slightly different set of
> arguments (including matrix dims) but it goes straight down to C with
> no code in R so it will be faster.
>
>>
>> You might also use the compiler() package to byte-compile that inner
>
> Also, shouldn't have put parens after compiler.
>
> Best,
> Michael
>
>> loop. [The function going to sapply] It won't be massive, but perhaps
>> another 3 or 4x
>>
>> Michael
>>
>> On Fri, Jun 15, 2012 at 8:13 AM, Simon Knos
>> <simon_mailing at quantentunnel.de> wrote:
>>> Rui, thank you very much.
>>>
>>> I keep forgetting about the rowSum and friends. (precalculating the
>>> powers just slipped my attention).
>>>
>>> And, yes, a factor of will of course do. Do you see a further
>>> improvement in this case?
>>>
>>>
>>> Best,
>>>
>>> Simon
>>>
>>> On Fri, Jun 15, 2012 at 12:25 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>>> Hello,
>>>>
>>>> Will a factor of 4 do?
>>>> This is variant 3, revised.
>>>>
>>>> #################################################
>>>> ## Variant 3.b                                 ##
>>>>
>>>> #################################################
>>>>
>>>>
>>>> ## Initialize matrix to hold results
>>>> singlecolor <- matrix(NA, simlength, noplayer)
>>>>
>>>> ## construct the deck to sample from
>>>> basedeck <- rep(10^(1:4), 13)
>>>> ## Pre-compute this vector, don't re-compute inside a loop
>>>> pow10x5 <- 5*10^(1:4)
>>>>
>>>>
>>>> ## This one uses matrix(...,5) to create the individual hands
>>>> ## but it's created in advance
>>>> currentdeck <- matrix(nrow = 5, ncol=noplayer)
>>>>
>>>>
>>>> ## comparison by using %in%
>>>> set.seed(7777)
>>>> system.time({
>>>>   singlecolor[] <- sapply(1:simlength, function(i){
>>>>    currentdeck[] <- sample(basedeck, decklength)
>>>>    colSums(currentdeck) %in% pow10x5
>>>>   })
>>>> })
>>>> apply(singlecolor, 2, mean)  ## colMeans()
>>>> mean(apply(singlecolor, 2, mean))
>>>>
>>>>
>>>> Note that the real speed gain is in colSums, all the rest gave me around 1.5
>>>> secs or 5% only.
>>>>
>>>> Rui Barradas
>>>>
>>>> Em 15-06-2012 09:40, Simon Knos escreveu:
>>>>>
>>>>> Dear List Members
>>>>>
>>>>>
>>>>>
>>>>> I used to play around with R to answer the following question by
>>>>> simulation (I am aware there is an easy explicit solution, but this is
>>>>> intended to serve as instructional example).
>>>>>
>>>>> Suppose you have a poker game with 6 players and a deck of 52 cards.
>>>>> Compute the empirical frequencies of having a single-suit hand. The
>>>>> way I want the result structured is a boolean nosimulation by noplayer
>>>>> matrix containing true or false
>>>>> depending whether the specific player was dealt a single-suit hand.
>>>>> The code itself is quite short: 1 line to "deal the cards", 1 line to
>>>>> check whether any of the six players has single-suit hand.
>>>>>
>>>>>
>>>>> I played around with different variants (all found below) and managed
>>>>> to gain some speed, however, I subjectively still find it quite slow.
>>>>>
>>>>> I would thus very much appreciate if anybody could point me to
>>>>> a) speed improvments in general
>>>>> b) speed improvements using the compiler package: At what level is
>>>>> cmpfun best used in this particular example?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thank you very much,
>>>>>
>>>>>
>>>>> Simon
>>>>>
>>>>>
>>>>> ###################################Code#########################################
>>>>>
>>>>> noplayer <- 6
>>>>> simlength <- 1e+05
>>>>> decklength <- 5 * noplayer
>>>>>
>>>>>
>>>>>
>>>>> #################################################
>>>>> ## Variant 1                                   ##
>>>>> #################################################
>>>>>
>>>>>
>>>>>
>>>>> ## Initialize matrix to hold results
>>>>> singlecolor <- matrix(NA, simlength, noplayer)
>>>>> ## construct the deck to sample from
>>>>> basedeck <- rep(1:4, 13)
>>>>> ## This one uses split to create the individual hands
>>>>>
>>>>> set.seed(7777)
>>>>> system.time({
>>>>>   for (i in 1:simlength) {
>>>>>     currentdeck <- split(sample(basedeck, decklength), rep(1:noplayer, 5))
>>>>>     singlecolor[i, ] <- sapply(currentdeck, function(inv) {
>>>>> length(unique(inv)) == 1 })
>>>>>   }
>>>>> })
>>>>> apply(singlecolor, 2, mean)
>>>>> mean(apply(singlecolor, 2, mean))
>>>>>
>>>>>
>>>>>
>>>>> #################################################
>>>>> ## Variant 2                                   ##
>>>>> #################################################
>>>>>
>>>>>
>>>>>
>>>>> ## Initialize matrix to hold results
>>>>> singlecolor <- matrix(NA, simlength, noplayer)
>>>>>
>>>>> ## construct the deck to sample from
>>>>> basedeck <- rep(10^(1:4), 13)
>>>>>
>>>>> ## This one uses matrix(...,5) to create the individual hands
>>>>> ## comparison by using powers of ten
>>>>> set.seed(7777)
>>>>> system.time({
>>>>>   for (i in 1:simlength) {
>>>>>     sampledeck <- sample(basedeck, decklength)
>>>>>     currentdeck <- matrix(sampledeck, nrow = 5)
>>>>>     singlecolor[i, ] <- apply(currentdeck, 2, function(inv) {
>>>>> any(sum(inv) == (5 * 10^(1:4))) })
>>>>>   }
>>>>> })
>>>>> apply(singlecolor, 2, mean)
>>>>> mean(apply(singlecolor, 2, mean))
>>>>>
>>>>>
>>>>> #################################################
>>>>> ## Variant 3                                   ##
>>>>> #################################################
>>>>>
>>>>>
>>>>> ## Initialize matrix to hold results
>>>>> singlecolor <- matrix(NA, simlength, noplayer)
>>>>>
>>>>> ## construct the deck to sample from
>>>>> basedeck <- rep(10^(1:4), 13)
>>>>>
>>>>> ## This one uses matrix(...,5) to create the individual hands
>>>>> ## comparison by using %in%
>>>>> set.seed(7777)
>>>>> system.time({
>>>>>   for (i in 1:simlength) {
>>>>>     sampledeck <- sample(basedeck, decklength)
>>>>>     currentdeck <- matrix(sampledeck, nrow = 5)
>>>>>     singlecolor[i, ] <- apply(currentdeck, 2, sum) %in% (5 * 10^(1:4))
>>>>>   }
>>>>> })
>>>>> apply(singlecolor, 2, mean)
>>>>> mean(apply(singlecolor, 2, mean))
>>>>>
>>>>>
>>>>> #################################################
>>>>> ## Variant 4                                   ##
>>>>> #################################################
>>>>>
>>>>>
>>>>>
>>>>> ## Initialize matrix to hold results
>>>>> singlecolor <- matrix(NA, simlength, noplayer)
>>>>>
>>>>> ## construct the deck to sample from
>>>>> basedeck <- rep(1:4, 13)
>>>>>
>>>>> ## This one uses matrix(...,5) to create the individual hands
>>>>> ## comparison by using length(unique(...))
>>>>> set.seed(7777)
>>>>> system.time({
>>>>>   for (i in 1:simlength) {
>>>>>     sampledeck <- sample(basedeck, decklength)
>>>>>     currentdeck <- matrix(sampledeck, nrow = 5)
>>>>>     singlecolor[i, ] <- apply(currentdeck, 2, function(inv) {
>>>>> length(unique(inv)) == 1 })
>>>>>   }
>>>>> })
>>>>> apply(singlecolor, 2, mean)
>>>>> mean(apply(singlecolor, 2, mean))
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list