[R] how to not sort factors when plotting

Gavin Simpson gavin.simpson at ucl.ac.uk
Mon May 12 18:54:53 CEST 2008


On Mon, 2008-05-12 at 11:41 -0400, Jorge Ivan Velez wrote:
> Hi Lydia,

[I'm struggling to see what this has to do with the subject line?]

> 
> I compared my ratio function with Dimitris and Phil's suggestions. Please do
> NOT use my approach because it's painfully slow for a large vector (as Phil
> told me). Here is why (using Win XP SP2, Intel Core- 2 Duo 2.4 GHz, R 2.7.0
> Patched):

Jorge,

If you pre-allocate storage for temp, rather than growing it by
concatenation as you do, the loop is reasonably speedy for the example
you quote:

Note: you have 1:length(x), which throws an error once temp is
pre-allocated (x[0] is zero-length, so the assignment to temp[1]
fails). The loop range needs to be 2:length(x) (or, more robustly,
seq_along(x[-1]) + 1).

> x <- rnorm(100000,0,1)
> my.ratio <- function(x){
+ temp <- numeric(length(x))
+ for(n in 2:length(x))
+ temp[n] <- x[n] / x[n-1]
+ temp
+ }
> system.time(my.ratio(x))
   user  system elapsed 
  0.757   0.000   0.758 
> new.ratio <- function(x) x[2:length(x)]/x[1:(length(x)-1)]
> system.time(new.ratio(x))
   user  system elapsed 
  0.011   0.003   0.013 

OK, so it isn't as fast as the vectorised approach (about 0.76 s
against 0.01 s elapsed above), but it isn't bad. For those more familiar
with C-type programming than the R vector approach, you can do
reasonably well with a for loop as long as you allocate the result
vector/object at its full size first.
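
As a rough sketch of the pre-allocation idea, here is a variant that also
copes with vectors shorter than two elements. The names ratio_loop and
ratio_vec are made up for illustration and are not from this thread.
Unlike my.ratio above, both return length(x) - 1 values, with no leading
zero:

```r
# Hypothetical helper names (ratio_loop, ratio_vec); illustration only.

# Loop version: allocate the whole result once, then fill it in place.
ratio_loop <- function(x) {
  n <- length(x)
  if (n < 2L) return(numeric(0))   # nothing to compare for 0 or 1 values
  out <- numeric(n - 1L)           # pre-allocated result vector
  for (i in seq_len(n - 1L)) {     # seq_len() is safe where 2:n is not
    out[i] <- x[i + 1L] / x[i]
  }
  out
}

# Vectorised version: drop the first element, divide by all but the last.
ratio_vec <- function(x) {
  if (length(x) < 2L) return(numeric(0))
  x[-1L] / x[-length(x)]
}

x <- c(1, 2, 3, 2, 1, 2, 3)
ratio_loop(x)
all.equal(ratio_loop(x), ratio_vec(x))
```

The guard matters because 2:length(x) counts downwards (2, 1) when x has
a single element, so the loop body would still run.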

Note that I'm not suggesting people shouldn't bother to learn to play
to R's strengths and write idiomatic R rather than code the way they
learnt in other languages (I'm not!), but for some problems a loop is
just fine, unless you are the sort of person who needs that extra 0.7 of
a second to do something else with... ;-)

HTH

G

> 
> 
> # Vector
> x=rnorm(100000,0,1)
> 
> # Suggestion
> new.ratio=function(x) x[2:length(x)]/x[1:(length(x)-1)]
> 
> # My horrible function
> my.ratio=function(x){
>  temp=NULL
>  for (n in 1:length(x)) temp=c(temp,x[n]/x[n-1])
>  temp
>  }
> 
>  # System time
> t=system.time(my.ratio(x))
> tnr=system.time(new.ratio(x))
>  t
>    user  system elapsed
>   38.79    0.06   39.31
>  tnr
>    user  system elapsed
>       0       0       0
> 
> 
> Thanks to all,
> 
> Jorge
> 
> 
> 
> On Mon, May 12, 2008 at 11:15 AM, Phil Spector <spector at stat.berkeley.edu>
> wrote:
> 
> > Another alternative would be to take advantage of R's vectorization:
> >
> >  x=c(1,2,3,2,1,2,3)
> > > x[2:length(x)]/x[1:(length(x)-1)]
> > >
> > [1] 2.0000000 1.5000000 0.6666667 0.5000000 2.0000000 1.5000000
> >
> > The solution using your ratio function will be painfully slow
> > for a large vector.
> >
> >                                       - Phil Spector
> >                                         Statistical Computing Facility
> >                                         Department of Statistics
> >                                         UC Berkeley
> >                                         spector at stat.berkeley.edu
> >
> >
> 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


