[R] Fwd: Long for Loop- calling C from R - Parallel Computing

jim holtman jholtman at gmail.com
Tue Oct 6 18:44:41 CEST 2009


It would help to understand the problem you are trying to solve and
the constraints that you have to live under.  I routinely process
files with millions of rows of data, do a lot of processing and create
graphics/reports from them in what I think is reasonable time (a
couple of minutes at most for the complex stuff) all within R without
having to write C/FORTRAN.  I am not sure what assumptions you are
currently operating under, but it would be good to state them so that
we know how to reply to the question that you are asking.

There are performance tips that we could provide if we knew what you
were trying to do.
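
As a rough illustration of one such tip, here is a minimal sketch. It is not code from anyone in this thread. It assumes the draws are standard normal (rnorm) and that each group of 60 is summarised by its mean; the names h, chunk, loop_means, draws and vec_means are hypothetical. It compares one rnorm() call per group with a single rnorm() call for everything, in the spirit of the set.seed() example quoted further down.

```r
# Hypothetical sizes: the thread mentions chunks of 60 and h in the millions;
# h is kept small here so the sketch runs quickly.
h <- 1000L     # number of groups (assumed)
chunk <- 60L   # draws per group

# Loop version: one rnorm() call per group.
set.seed(1)
loop_means <- numeric(h)
for (i in seq_len(h)) {
  loop_means[i] <- mean(rnorm(chunk))
}

# Vectorised version: a single rnorm() call, reshaped so that each
# column holds one group of 60 consecutive draws.
set.seed(1)
draws <- matrix(rnorm(h * chunk), nrow = chunk, ncol = h)
vec_means <- colMeans(draws)

# Compare the two sets of group means (up to floating-point tolerance).
all.equal(loop_means, vec_means)
```

Because matrix() fills column by column, column i holds the same 60 draws that the i-th loop iteration would have produced, so the grouping is unchanged. If h * chunk values do not fit in memory, the same idea can be applied to a few large batches instead of one.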

On Tue, Oct 6, 2009 at 10:10 AM, Antonio Paredes
<antonioparedes14 at gmail.com> wrote:
> ---------- Forwarded message ----------
> From: Antonio Paredes <antonioparedes14 at gmail.com>
> Date: Tue, Oct 6, 2009 at 9:41 AM
> Subject: Re: [R] Long for Loop- calling C from R - Parallel Computing
> To: Karl Ove Hufthammer <karl at huftis.org>
>
>
> Hello again,
>
> I'm hoping to get a response from some of the R gurus on this list. Is my
> assumption correct that R is not designed or built to handle large-scale
> simulation (lots of simulated data)? For example, how does one minimize system
> time: does one have to call a lower-level language like C or Fortran, or just,
> like many of you have done, do a lot of programming in R until the tricks are
> eventually learned?
>
> Thanks
>
> On Mon, Oct 5, 2009 at 8:35 AM, Antonio Paredes
> <antonioparedes14 at gmail.com> wrote:
>
>> In my case it does, because I need to preserve a "high level" of
>> independence (lack of correlation) among the different groups of 60. Also,
>> when I say final result I mean computation of standard errors and that
>> sort of stuff; sorry about the lack of clarity in my statement.
>>
>>
>> On Mon, Oct 5, 2009 at 9:48 AM, Karl Ove Hufthammer <karl at huftis.org> wrote:
>>
>>> In article <6f6f0fd60910050629p28c99209jcd7836353fd2d754
>>> @mail.gmail.com>, antonioparedes14 at gmail.com says...
>>> > I'm running the following for loop to generate random variables in
>>> > chunks of 60 at a time (l); here h is on the order of millions (could
>>> > be 5 to 6 million). Note that generating all the variables at once
>>> > could have an impact on the final results
>>>
>>> No, it will not. See this example code for an illustration:
>>>
>>> set.seed(1)
>>> rnorm(3)
>>> rnorm(3)
>>> set.seed(1)
>>> rnorm(6)
>>>
>>> So generating the six numbers three at a time or all at once gives
>>> exactly the same result.
>>>
>>> So my suggestion is to generate all the numbers at once. That takes next
>>> to no time. Or, if it takes too much memory, generate for example a
>>> million at once, and repeat a few times.
>>>
>>> --
>>> Karl Ove Hufthammer
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> -Tony
>>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



