[R] Managing output

Phil Spector spector at stat.berkeley.edu
Thu Aug 27 00:45:17 CEST 2009


Noah -
    Just allocate the maximum length that you'd ever need, and 
then change the length of the vector at the end of the program.
By the way, here's a little demonstration of what a difference
pre-allocation makes:

> system.time({x <- NULL;for(i in 1:10000)x <- c(x,rnorm(1))})
    user  system elapsed
   0.584   0.000   0.588 
> system.time({x <- numeric(10000);for(i in 1:10000)x[i] <- rnorm(1)})
    user  system elapsed
   0.120   0.000   0.122

The difference will be greater if you actually do something inside of
the loop.
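
The same comparison can be written as a script that also checks
that the two approaches produce the same values. This is only a
sketch: the function names grow_vec and prealloc_vec, the seed, and
n are my own choices for illustration, and the timings will vary
from machine to machine.

```r
# Hypothetical helper names; n and the seed are arbitrary choices.
grow_vec <- function(n) {
  x <- NULL
  for (i in seq_len(n)) x <- c(x, rnorm(1))   # builds a new, longer vector each time
  x
}

prealloc_vec <- function(n) {
  x <- numeric(n)                             # allocate once, up front
  for (i in seq_len(n)) x[i] <- rnorm(1)      # fill by subscript
  x
}

n <- 10000
set.seed(1); t_grow <- system.time(a <- grow_vec(n))
set.seed(1); t_pre  <- system.time(b <- prealloc_vec(n))

# Elapsed seconds for each approach
cat("grow:", t_grow[["elapsed"]], " prealloc:", t_pre[["elapsed"]], "\n")

# Same seed, same sequence of draws, so the results should agree
identical(a, b)
```

Try a larger n to see how the cost of growing the vector scales.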

To clarify my first point, use something like this:

> x = numeric(10000)
> j = 0
> for(i in 1:10000){
+     r = rnorm(1)
+     if(r < .1){
+       j = j + 1
+       x[j] = r
+       }
+     }
> length(x) = j
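
If it helps, the same pattern can be wrapped in a function so the
counter and the trimming step stay together. This is a minimal
sketch; collect_below and its arguments are hypothetical names, and
rnorm(1) just stands in for whatever your real process computes.

```r
# collect_below() and its arguments are hypothetical names for illustration.
collect_below <- function(n, threshold = 0.1) {
  x <- numeric(n)          # at most n values can ever be kept
  j <- 0                   # number of slots filled so far
  for (i in seq_len(n)) {
    r <- rnorm(1)          # stand-in for the real score
    if (r < threshold) {
      j <- j + 1
      x[j] <- r            # store by subscript, no growing
    }
  }
  length(x) <- j           # drop the unused tail
  x
}

set.seed(42)
kept <- collect_below(10000)
length(kept)               # how many values passed the threshold
all(kept < 0.1)            # every kept value is below the threshold
```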

The overallocation doesn't actually slow things down:

> system.time({x <- numeric(10000);for(i in 1:10000)x[i] <- rnorm(1)})
    user  system elapsed
   0.120   0.000   0.122 
> system.time({x <- numeric(100000);for(i in 1:10000)x[i] <- rnorm(1);length(x) <- 10000})
    user  system elapsed
   0.128   0.000   0.126
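
For the toy example in this message, where each draw is independent
of the others, the loop can be skipped entirely with a logical
subscript, along the lines Erik suggested earlier in the thread.
This is only a sketch of that idea; it may not carry over to a real
process with nested loops and conditionals, and the seed is an
arbitrary choice.

```r
set.seed(42)
r <- rnorm(10000)      # draw all candidate values at once
x <- r[r < 0.1]        # keep only the values below the threshold
length(x)              # number of values kept
```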

                                                              - Phil



On Wed, 26 Aug 2009, Noah Silverman wrote:

> Phil,
>
> Pre-allocation makes sense.  However, I don't know the size of my resulting 
> vector when starting.  In my loop, I only pull off results that meet a 
> certain threshold.
>
> -N
>
> On 8/26/09 2:07 PM, Phil Spector wrote:
>> Noah -
>>    I would strongly advise you to preallocate the result vector
>> using numeric() or rep(), and then enter the values based on subscripts. 
>> Allowing objects to grow inside of loops is one of
>> the biggest mistakes an R programmer can make.
>>
>>                     - Phil Spector
>>                      Statistical Computing Facility
>>                      Department of Statistics
>>                      UC Berkeley
>>                      spector at stat.berkeley.edu
>> 
>> 
>> On Wed, 26 Aug 2009, Noah Silverman wrote:
>> 
>>> The actual process is REALLY complicated; I just gave a simple example
>>> for the list.
>>> 
>>> I have a  lot of steps to process the data before I get a final
>>> "score".  (nested loops, conditional statements, etc.)
>>> 
>>> Right now, I'm just printing the scores to the screen.  I'd like to
>>> accumulate them in some kind of data structure so I can either write
>>> them to disk or graph them.
>>> 
>>> -N
>>> 
>>> On 8/26/09 12:27 PM, Erik Iverson wrote:
>>>> How about ?append, but R is vectorized, so why not just
>>>> 
>>>> result_list <- 2*item^2, or for more complicated tasks, the 
>>>> apply/sapply/lapply/mapply family of functions?
>>>> 
>>>> In general, the "for" loop construct can be avoided so you don't have to 
>>>> think about messy indexing.  What exactly are you trying to do?
>>>> 
>>>> -----Original Message-----
>>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] 
>>>> On Behalf Of Noah Silverman
>>>> Sent: Wednesday, August 26, 2009 2:20 PM
>>>> To: r help
>>>> Subject: [R] Managing output
>>>> 
>>>> Hi,
>>>> 
>>>> 
>>>> Is there a way to build up a vector, item by item?  In perl, we can
>>>> "push" an item onto an array.  How can we do this in R?
>>>> I have a loop that generates values as it goes.  I want to end up with a
>>>> vector of all the loop results.
>>>> 
>>>> In perl it would be:
>>>> 
>>>> for(item in list){
>>>>       result<- 2*item^2   (or whatever formula; this is just a pseudo example)
>>>>       Push(@result_list, result)   (this is the step I can't do in R)
>>>> }
>>>> 
>>>> 
>>>> Thanks!
>>>> 
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide 
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>> 
>>>
>



