[R] Looking for Speed in a Toy Simulation Example

R. Michael Weylandt michael.weylandt at gmail.com
Fri Jun 15 16:50:17 CEST 2012


On Fri, Jun 15, 2012 at 9:48 AM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:
> As of recent versions of R, you can actually go for what are
> officially recognized as "ultimate speed" functions .rowSums() and
> friends.

Sorry, perhaps that wasn't totally clear. Regarding .rowSums() note
that leading period. You pass this a slightly different set of
arguments (including matrix dims) but it goes straight down to C with
no code in R so it will be faster.

>
> You might also use the compiler() package to byte-compile that inner

Also, shouldn't have put parens after compiler.

Best,
Michael

> loop. [The function going to sapply] It won't be massive, but perhaps
> another 3 or 4x
>
> Michael
>
> On Fri, Jun 15, 2012 at 8:13 AM, Simon Knos
> <simon_mailing at quantentunnel.de> wrote:
>> Rui, thank you very much.
>>
>> I keep forgetting about the rowSum and friends. (precalculating the
>> powers just slipped my attention).
>>
>> And, yes, a factor of will of course do. Do you see a further
>> improvement in this case?
>>
>>
>> Best,
>>
>> Simon
>>
>> On Fri, Jun 15, 2012 at 12:25 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>> Hello,
>>>
>>> Will a factor of 4 do?
>>> This is variant 3, revised.
>>>
>>> #################################################
>>> ## Variant 3.b                                 ##
>>>
>>> #################################################
>>>
>>>
>>> ## Initialize matrix to hold results
>>> singlecolor <- matrix(NA, simlength, noplayer)
>>>
>>> ## construct the deck to sample from
>>> basedeck <- rep(10^(1:4), 13)
>>> ## Pre-compute this vector, don't re-compute inside a loop
>>> pow10x5 <- 5*10^(1:4)
>>>
>>>
>>> ## This one uses matrix(...,5) to create the individual hands
>>> ## but it's created in advance
>>> currentdeck <- matrix(nrow = 5, ncol=noplayer)
>>>
>>>
>>> ## comparison by using %in%
>>> set.seed(7777)
>>> system.time({
>>>  singlecolor[] <- sapply(1:simlength, function(i){
>>>   currentdeck[] <- sample(basedeck, decklength)
>>>   colSums(currentdeck) %in% pow10x5
>>>  })
>>> })
>>> apply(singlecolor, 2, mean)  ## colMeans()
>>> mean(apply(singlecolor, 2, mean))
>>>
>>>
>>> Note that the real speed gain is in colSums, all the rest gave me around 1.5
>>> secs or 5% only.
>>>
>>> Rui Barradas
>>>
>>> Em 15-06-2012 09:40, Simon Knos escreveu:
>>>>
>>>> Dear List Members
>>>>
>>>>
>>>>
>>>> I used to play around with R to answer the following question by
>>>> simulation (I am aware there is an easy explicit solution, but this is
>>>> intended to serve as instructional example).
>>>>
>>>> Suppose you have a poker game with 6 players and a deck of 52 cards.
>>>> Compute the empirical frequencies of having a single-suit hand. The
>>>> way I want the result structured is a boolean nosimulation by noplayer
>>>> matrix containing true or false
>>>> depending whether the specific player was dealt a single-suit hand.
>>>> The code itself is quite short: 1 line to "deal the cards", 1 line to
>>>> check whether any of the six players has single-suit hand.
>>>>
>>>>
>>>> I played around with different variants (all found below) and managed
>>>> to gain some speed, however, I subjectively still find it quite slow.
>>>>
>>>> I would thus very much appreciate if anybody could point me to
>>>> a) speed improvments in general
>>>> b) speed improvements using the compiler package: At what level is
>>>> cmpfun best used in this particular example?
>>>>
>>>>
>>>>
>>>>
>>>> Thank you very much,
>>>>
>>>>
>>>> Simon
>>>>
>>>>
>>>> ###################################Code#########################################
>>>>
>>>> noplayer <- 6
>>>> simlength <- 1e+05
>>>> decklength <- 5 * noplayer
>>>>
>>>>
>>>>
>>>> #################################################
>>>> ## Variant 1                                   ##
>>>> #################################################
>>>>
>>>>
>>>>
>>>> ## Initialize matrix to hold results
>>>> singlecolor <- matrix(NA, simlength, noplayer)
>>>> ## construct the deck to sample from
>>>> basedeck <- rep(1:4, 13)
>>>> ## This one uses split to create the individual hands
>>>>
>>>> set.seed(7777)
>>>> system.time({
>>>>  for (i in 1:simlength) {
>>>>    currentdeck <- split(sample(basedeck, decklength), rep(1:noplayer, 5))
>>>>    singlecolor[i, ] <- sapply(currentdeck, function(inv) {
>>>> length(unique(inv)) == 1 })
>>>>  }
>>>> })
>>>> apply(singlecolor, 2, mean)
>>>> mean(apply(singlecolor, 2, mean))
>>>>
>>>>
>>>>
>>>> #################################################
>>>> ## Variant 2                                   ##
>>>> #################################################
>>>>
>>>>
>>>>
>>>> ## Initialize matrix to hold results
>>>> singlecolor <- matrix(NA, simlength, noplayer)
>>>>
>>>> ## construct the deck to sample from
>>>> basedeck <- rep(10^(1:4), 13)
>>>>
>>>> ## This one uses matrix(...,5) to create the individual hands
>>>> ## comparison by using powers of ten
>>>> set.seed(7777)
>>>> system.time({
>>>>  for (i in 1:simlength) {
>>>>    sampledeck <- sample(basedeck, decklength)
>>>>    currentdeck <- matrix(sampledeck, nrow = 5)
>>>>    singlecolor[i, ] <- apply(currentdeck, 2, function(inv) {
>>>> any(sum(inv) == (5 * 10^(1:4))) })
>>>>  }
>>>> })
>>>> apply(singlecolor, 2, mean)
>>>> mean(apply(singlecolor, 2, mean))
>>>>
>>>>
>>>> #################################################
>>>> ## Variant 3                                   ##
>>>> #################################################
>>>>
>>>>
>>>> ## Initialize matrix to hold results
>>>> singlecolor <- matrix(NA, simlength, noplayer)
>>>>
>>>> ## construct the deck to sample from
>>>> basedeck <- rep(10^(1:4), 13)
>>>>
>>>> ## This one uses matrix(...,5) to create the individual hands
>>>> ## comparison by using %in%
>>>> set.seed(7777)
>>>> system.time({
>>>>  for (i in 1:simlength) {
>>>>    sampledeck <- sample(basedeck, decklength)
>>>>    currentdeck <- matrix(sampledeck, nrow = 5)
>>>>    singlecolor[i, ] <- apply(currentdeck, 2, sum) %in% (5 * 10^(1:4))
>>>>  }
>>>> })
>>>> apply(singlecolor, 2, mean)
>>>> mean(apply(singlecolor, 2, mean))
>>>>
>>>>
>>>> #################################################
>>>> ## Variant 4                                   ##
>>>> #################################################
>>>>
>>>>
>>>>
>>>> ## Initialize matrix to hold results
>>>> singlecolor <- matrix(NA, simlength, noplayer)
>>>>
>>>> ## construct the deck to sample from
>>>> basedeck <- rep(1:4, 13)
>>>>
>>>> ## This one uses matrix(...,5) to create the individual hands
>>>> ## comparison by using length(unique(...))
>>>> set.seed(7777)
>>>> system.time({
>>>>  for (i in 1:simlength) {
>>>>    sampledeck <- sample(basedeck, decklength)
>>>>    currentdeck <- matrix(sampledeck, nrow = 5)
>>>>    singlecolor[i, ] <- apply(currentdeck, 2, function(inv) {
>>>> length(unique(inv)) == 1 })
>>>>  }
>>>> })
>>>> apply(singlecolor, 2, mean)
>>>> mean(apply(singlecolor, 2, mean))
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list