[Rd] 10x slower merge in mac 2.9.1 vs. 2.9.0 (PR#13890)

Simon Urbanek simon.urbanek at r-project.org
Thu Aug 13 17:52:09 CEST 2009


Rick,

I'm sorry, but I cannot reproduce it. You didn't supply sessionInfo()  
and the actual data, so all I can do is guess, but according to your  
description this test case shows no difference:

set.seed(1)
n=10000
d1 = data.frame(seqn=as.integer(runif(n)*n), a=rnorm(n), b=rnorm(n),
                c=rnorm(n), d=rnorm(n), e=rnorm(n), f=rnorm(n),
                g=rnorm(n), h=rnorm(n), i=rnorm(n))
d2 = data.frame(seqn=as.integer(runif(n)*n), a=rnorm(n), b=rnorm(n),
                c=rnorm(n), d=rnorm(n), e=rnorm(n), f=rnorm(n),
                g=rnorm(n), h=rnorm(n), i=rnorm(n))
system.time(merge(d1,d2,by="seqn",all.x=TRUE))

R 2.9.1:
 > system.time(merge(d1,d2,by="seqn",all.x=TRUE))
    user  system elapsed
   0.150   0.067   0.217

R 2.9.0:
 > system.time(merge(d1,d2,by="seqn",all.x=TRUE))
    user  system elapsed
   0.148   0.068   0.216

To substantiate your claim, please provide a reproducible example as  
well as sessionInfo() [and details on how you run it - GUI, CLI, ...],  
but I suspect the difference may be in your data, not R.
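
As a rough sketch of what such a report could contain (the data frames x
and y and the helper key_summary below are hypothetical stand-ins, not
your actual demo.2 and ph), something along these lines would show the
type and duplication of the merge key, a few repeated timings, and the
session details:

```r
# Hypothetical stand-ins for the reporter's data; replace with the real frames.
set.seed(1)
n <- 10000
x <- data.frame(seqn = as.integer(runif(n) * n), a = rnorm(n))
y <- data.frame(seqn = as.integer(runif(n) * n), b = rnorm(n))

# Summarise properties of the key column that may influence merge() cost:
# its class, its length, and how many values are duplicates.
# key_summary is an illustrative helper, not part of R.
key_summary <- function(df, key = "seqn") {
  k <- df[[key]]
  list(class = class(k), n = length(k), n_dup = sum(duplicated(k)))
}
str(key_summary(x))
str(key_summary(y))

# Repeat the timing a few times so a single noisy run does not mislead.
timings <- replicate(
  3,
  system.time(merge(x, y, by = "seqn", all.x = TRUE))[["elapsed"]]
)
print(timings)

# R version, platform, and attached packages.
print(sessionInfo())
```

Running the same script under both R versions, with the same packages
loaded, should make it easier to tell whether the data or R is
responsible.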

Thanks,
Simon


On Aug 12, 2009, at 12:25 , richard_stahlhut at urmc.rochester.edu wrote:

> Full_Name: Rick Stahlhut
> Version: 2.9.1
> OS: os x 10.5.7
> Submission from: (NULL) (128.151.71.23)
>
>
> I upgraded to 2.9.1 today from 2.9.0.  I work with large CDC (Centers for
> Disease Control) datasets and start, frequently, with a series of 23
> large-ish merges to create the final dataset I work on.  I do this each
> time because (a) R is fast, so why not? and (b) the datasets occasionally
> get updated by CDC and it's easier to swap in new files that way.
>
> One such merge is two data.frames with 10 variables and 10,000 rows each.
> The command in question is:
>
> temp = merge (demo.2,ph,by="seqn",all.x=TRUE)
>
> in 2.9.0, this command took 3.3 seconds.
> in 2.9.1, it took 35.8 seconds.
>
> I have reverted back to 2.9.0.
>
> Additional packages loaded are:
>
> library(Hmisc)
> library(alr3)
> library(epicalc)
> library(ggplot2)
> library(lattice)
> library(reshape)
> library(survey)
> library(car)
>
> thanks very much for all the effort.  R is wonderful.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>


