[Rd] Extreme slowdown with named vectors. A bug?

Henrik Bengtsson hb at stat.berkeley.edu
Sat Oct 7 08:11:04 CEST 2006


Thank *you* for identifying the source of the problem and fixing it. :) /Henrik

On 10/6/06, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> On 10/6/2006 6:20 PM, Henrik Bengtsson wrote:
> > Tried the following with R --vanilla on the R v2.4.0 release (see
> > details at the end).  I think the script and its comments speak for
> > themselves, but the outcome is certainly not wanted.
>
> I think this is fixed now in R-devel and R-patched.  Thanks for the
> report, and the detailed script to reproduce the bug.
>
> Duncan Murdoch
>
> >
> > for (n in 58950:58970) {
> >   cat("n=", n, "\n", sep="");
> >
> >   # Clean up first
> >   rm(names, x, y); gc();
> >
> >   # Create a named vector of length n
> >   # Try with format "%5d" and it works
> >   names <- sprintf("%05d", 1:n);
> >   x <- seq(along=names);
> >   names(x) <- names;
> >
> >   # Extract the first k elements
> >   k <- 36422;
> >   t0 <- system.time({
> >     y <- x[names[1:k]];
> >   })
> >   str(y);
> >
> >   # But with one more element it takes
> >   # forever when n >= 58960
> >   k <- k + 1;
> >   t1 <- system.time({
> >     y <- x[names[1:k]];
> >   })
> >   # ...then t1/t0 ~= 300-500 and growing!
> >   print(t1/t0);
> >   str(y);
> > }
> >
> >
> > The interesting thing is that if you replace
> >
> >  y <- x[names[1:k]];
> >
> > with
> >
> >  idxs <- match(names[1:k], names(x));
> >  y <- x[idxs];
> >
> > everything is fine.
> >
> > (For those working with the Affy 100K SNP chips, the freaky thing is
> > that the problem occurs at n = 58960 which is exactly the number of
> > SNPs on the Xba array; that's how I found out about the bug/feature in
> > the first place).
> >
> > Tried this on two different systems:
> >
> >> sessionInfo()
> > R version 2.4.0 (2006-10-03)
> > i386-pc-mingw32
> > locale:
> > LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> > States.1252;LC_MONETARY=English_United
> > States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> > attached base packages:
> > [1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
> > [7] "base"
> >
> >> sessionInfo()
> > R version 2.4.0 (2006-10-03)
> > x86_64-unknown-linux-gnu
> > locale:
> > C
> > attached base packages:
> > [1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
> > [7] "base"
> >
> > Cheers
> >
> > /Henrik
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
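
For readers hitting the same slowdown on an unpatched R 2.4.0, the match()-based workaround quoted above can be sketched as follows. This is only an illustration: the sizes n and k are deliberately small (an assumption, not the n = 58960 from the report), and the variable names nms, y1 and y2 are made up for the example. It checks that both forms of subsetting return the same result when the names are unique.

```r
# Illustrative sizes only (assumption); the report used n = 58960, k = 36423
n <- 1000L
k <- 400L

# Build a named integer vector with unique, zero-padded names
nms <- sprintf("%05d", seq_len(n))
x <- seq_along(nms)
names(x) <- nms

# Direct subsetting by a character vector, as in the original script
y1 <- x[nms[seq_len(k)]]

# Workaround: resolve the names to integer positions first,
# then subset by position
idxs <- match(nms[seq_len(k)], names(x))
y2 <- x[idxs]

# With unique names and no missing matches, the results agree
stopifnot(identical(y1, y2))
```

Note that the two forms are only interchangeable when every requested name exists in names(x); if a name is missing, match() returns NA for it and the name attached to the resulting element may differ.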



