[R] Learn Vectorization (Vectorize)

Bert Gunter gunter.berton at gene.com
Tue Jan 25 18:24:30 CET 2011


Inline below.

-- Bert

On Tue, Jan 25, 2011 at 7:48 AM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote:
> Try this:
>
> expand.grid(seq(startpos, endpos, by = diff(c(startpos, endpos)) / nrow(sr)),
>             seq(startpos, endpos, by = diff(c(startpos, endpos)) / nrow(sr)))
>
> On Tue, Jan 25, 2011 at 1:29 PM, Alaios <alaios at yahoo.com> wrote:
>
>> Greetings Friends,
>> I would be grateful if you can help me understand how to make my R code more
>> efficient.
>>
>> I have read in an R introductory tutorial that a for loop is not used so often
>> (and is maybe not that efficient) compared to other languages.
>>
>> So I am trying to build understanding how to get the equivalent of a for
>> loop using more R-oriented thinking.
>>
>> If I got it right one way to do that in R is Vectorize.

-- You got it wrong. Vectorize() is just another (disguised) way of
doing R-level looping, via apply functions (mapply()). It offers no
efficiency advantage over explicit for loops, though it may make for
cleaner programming, which IS a great advantage in the bigger scheme of
things, I admit. But that wasn't your question.
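
A minimal sketch of that point, using a made-up scalar function
(clamp01 is hypothetical, purely for illustration). The Vectorize()
wrapper and an explicit sapply() loop both call the function once per
element at R level; only the last line hands the whole vector to
built-in code:

```r
# Hypothetical scalar-only function: `if` needs a length-one condition.
clamp01 <- function(x) if (x < 0) 0 else if (x > 1) 1 else x

# Vectorize() returns a wrapper that calls mapply() over the argument,
# so clamp01 is still invoked once per element at R level.
vclamp <- Vectorize(clamp01)

x <- c(-0.5, 0.25, 1.5)
via_vectorize <- vclamp(x)
via_sapply    <- sapply(x, clamp01)   # explicit apply-style loop, same result

# Built-ins that operate on the whole vector in compiled code.
via_builtin <- pmin(pmax(x, 0), 1)
```

All three give the same values; only the built-in version avoids the
per-element function calls.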

The basic idea of vectorization in R is to use **built-in** commands
that act on whole objects through their internal (usually C) code.
These **will** be much faster than looping in R.

So for example to generate 100000 random numbers one can do:

> system.time(x <-rnorm(1e5))
   user  system elapsed
   0.02    0.00    0.02


## OR

> system.time({
+   y <- numeric(1e5)
+   for (i in 1:(1e5)) y[i] <- rnorm(1)
+ })
   user  system elapsed
   0.86    0.01    0.87


The former is vectorized: it produces all 100000 random numbers in one
call and runs about 40 times faster than the latter, which is not
vectorized and instead produces one random number at a time (and incurs
the overhead of a separate function call for each, I think).
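
The same idea applies to ordinary arithmetic. As a rough sketch (the
variable names here are only for illustration), both versions below
compute the same values, but the second lets the operators work on the
whole vector at once instead of one element per R-level iteration:

```r
x <- seq_len(1e5) / 1e5

# Loop version: one R-level iteration (and one assignment) per element.
sq_loop <- numeric(length(x))
for (i in seq_along(x)) sq_loop[i] <- x[i]^2 + 1

# Vectorized version: ^ and + act on the whole vector in compiled code.
sq_vec <- x^2 + 1
```

Wrapping each version in system.time() shows the difference on your
own machine; the exact ratio will vary.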

Please read an Intro to R for more. V&R's MASS also has some helpful
remarks on vectorization. I'm sure that other texts (e.g. Dalgaard's)
are worth consulting on this, too, but I haven't read them.

>> So I have written a
>> small snippet of code with two nested for loops. I would be grateful if you
>> can help me find the equivalent of this code using Vectorize (or any other R
>> thinking)
>>
>> My code takes as input an n*m matrix and prints the x,y coordinates where
>> each cell starts and ends.
>>
>>
>>
>> remap <- function(sr) {
>>   # Input: sr, the map (an n*m matrix)
>>   startpos <- -1
>>   endpos <- +1
>>   stepx <- (endpos - startpos) / nrow(sr)
>>   stepy <- (endpos - startpos) / ncol(sr)
>>
>>   for (i in seq(from = startpos, to = endpos, by = stepx)) {
>>     for (j in seq(from = startpos, to = endpos, by = stepy)) {  # y direction uses stepy
>>       cat(' \n', i, j)
>>     }
>>   }
>> }
>> sr<-matrix(data=seq(from=1,to=9),nrow=3,ncol=3,byrow=TRUE)
>> remap(sr)
>>
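
For this particular task, one loop-free possibility is to build the two
coordinate sequences once and let expand.grid() form all pairs. This is
only a sketch: remap_grid is a made-up name, it returns the pairs as a
data frame rather than printing them, and it assumes the x steps should
follow nrow(sr) and the y steps ncol(sr).

```r
# Sketch only: assumes x steps follow nrow(sr) and y steps follow ncol(sr).
remap_grid <- function(sr, startpos = -1, endpos = 1) {
  # length.out = n + 1 gives the n + 1 cell boundaries between the endpoints.
  xs <- seq(startpos, endpos, length.out = nrow(sr) + 1)
  ys <- seq(startpos, endpos, length.out = ncol(sr) + 1)
  # expand.grid() varies its first argument fastest, so y is listed first
  # to reproduce the nested-loop order (x outer, y inner).
  g <- expand.grid(y = ys, x = xs)
  g[, c("x", "y")]
}

sr <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
coords <- remap_grid(sr)
```

print(coords) then shows every (x, y) pair, one per row.
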
>> Regards
>> Alex
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>



-- 
Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://devo.gene.com/groups/devo/depts/ncb/home.shtml


