[Rd] .Call in R

Martin Morgan mtmorgan at fhcrc.org
Fri Nov 18 16:40:06 CET 2011


On 11/18/2011 07:08 AM, Karl Forner wrote:
> Hi,
>
> A probably very naive remark, but I believe that the probability of sum(
> runif(10000) )>= 50000 is exactly 0.5. So why not just test that, and
> generate the uniform values only if needed ?

My thought as well, but actually the deviates need to have mean > .5 so 
you'd do something like

   repeat {
      vecA <- runif(10000)
      if (mean(vecA) > .5) break
   }

You'd do this 1/2 the time, and you'd have to itearte on average 1 / 
(1/2) = 2 times before getting the vector satisfying the constraint, so 
the expected number of iterations is 1/2 * 2 = 1, the same as in the 
original implementation!

It does suggest that there is only one allocation required, if this were 
coded at the C level. But since sum(), mean(), and runif() all go more 
or less directly to C anyway it doesn't seem like this is the right 
problem for a C solution.

Martin

>
>
> Karl Forner
>
> On Thu, Nov 17, 2011 at 6:09 PM, Raymond<gwgc5 at mail.missouri.edu>  wrote:
>
>> Hi R developers,
>>
>>     I am new to this forum and hope someone can help me with .Call in R.
>> Greatly appreciate any help!
>>
>>     Say, I have a vector called "vecA" of length 10000, I generate a vector
>> called "vecR" with elements randomly generated from Uniform[0,1]. Both vecA
>> and vecR are of double type. I want to replace elements vecA by elements in
>> vecR only if sum of elements in vecR is greater than or equal to 5000.
>> Otherwise, vecR remain unchanged. This is easy to do in R, which reads
>>     vecA<-something;
>>     vecR<-runif(10000);
>>     if (sum(vecR)>=5000)){
>>        vecA<-vecR;
>>     }
>>
>>
>>     Now my question is, if I am going to do the same thing in R using .Call.
>> How can I achieve it in a more efficient way (i.e. less computation time
>> compared with pure R code above.).  My c code (called "change_vecA.c")
>> using
>> .Call is like this:
>>
>>     SEXP change_vecA(SEXP vecA){
>>          int i,vecA_len;
>>          double sum,*res_ptr,*vecR_ptr,*vecA_ptr;
>>
>>          vecA_ptr=REAL(vecA);
>>          vecA_len=length(vecA);
>>          SEXP res_vec,vecR;
>>
>>          PROTECT(res_vec=allocVector(REALSXP, vec_len));
>>          PROTECT(vecR=allocVector(REALSXP, vec_len));
>>          res_ptr=REAL(res_vec);
>>          vecR_ptr=REAL(vecR);
>>          GetRNGstate();
>>          sum=0.0;
>>          for (i=0;i<vecA_len;i++){
>>               vecR_ptr[i]=runif(0,1);
>>               sum+=vecR_ptr[i];
>>          }
>>          if (sum>=5000){
>>             /*copy vecR to the vector to be returned*/
>>             for (i=0;i<vecA_len;i++){
>>                   res_ptr[i]=vecR_ptr[i];
>>             }
>>          }
>>          else{
>>                 /*copy vecA to the vector to be returned*/
>>                 for (i=0;i<vecA_len;i++){
>>                       res_ptr[i]=vecA_ptr[i];
>>                 }
>>          }
>>
>>          PutRNGstate();
>>          UNPROTECT(2);
>>          resturn(res);
>> }
>> My R wrapper function is
>>         change_vecA<-function(vecA){
>>               dyn.load("change_vecA.so");
>>               .Call("change_vecA",vecA);
>>         }
>>
>>          Now my question is, due to two loops (one generates the random
>> vector and one determines the vector to be returned), can .Call still be
>> faster than pure R code (only one loop to copy vecR to vecA given condition
>> is met)? Or, how can I improve my c code to avoid redundant loops if any.
>> My
>> concern is if vecA is large (say of length 1000000 or even bigger), loops
>> in
>> C code can slow things down.  Thanks for any help!
>>
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Call-in-R-tp4080721p4080721.html
>> Sent from the R devel mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793



More information about the R-devel mailing list