[R] complexity of operations in R

Wed Jul 18 10:49:24 CEST 2012

Hello,

Em 18-07-2012 09:06, Patrick Burns escreveu:
> It looks to me like the following
> should do what you want:
>
> f2 <- function(dotot) array(FALSE, c(dotot, 1))
>
> What am I missing?

That matrix is even faster?

f2 <- function(dotot) array(FALSE, c(dotot, 1))
f3 <- function(dotot) matrix(FALSE, dotot, 1)

t2 <- system.time(replicate(1e4, f2(1000)))
t3 <- system.time(replicate(1e4, f3(1000)))
rbind(t2, t3, t2/t3)

Rui Barradas
>
> Pat
>
> On 17/07/2012 21:58, Johan Henriksson wrote:
>> thanks for the link! I should read it through. that said, I didn't find
>> any good general solution to the problem so here I post some attempts
>> for general input. maybe someone knows how to speed this up. both my
>> solutions are theoretically O(n) for creating a list of n elements. The
>> function to improve is O(n^2) which should suck tremendously - but the
>> slow execution of R probably blows up the constant factor of the smarter
>> solutions.
>>
>> Array doubling comes close in speed for large lists but it would be
>> great if it could be comparable for smaller lists. One hidden cost I see
>> directly is that allocating a list in R is O(n), not O(1) (or close),
>> since it always fills it with values. Is there a way around this? I
>> guess by using C, one could just malloc() and leave the content
>> undefined - but is there no better way?
>>
>> thanks,
>> /Johan
>>
>>
>> ################################
>> # the function we wish to improve
>>
>> f<-function(dotot){
>>    v<-matrix(ncol=1,nrow=0)
>>    for(i in 1:dotot){
>>      v<-rbind(v,FALSE)
>>    }
>>    return(v)
>> }
>>
>> ##########################
>> # first attempt: linked lists
>>
>> emptylist <- NA
>>
>> addtolist <- function(x,prev){
>>    return(list(x,prev))
>> }
>>
>> g<-function(dotot){
>>    v<-emptylist
>>    for(i in 1:dotot){
>>      v<-addtolist(FALSE,v)
>>    }
>>    return(v)
>> }
>>
>> ####################################
>> # second attempt: array doubling
>>
>> emptyexpandlist<-list(nelem=0,l=matrix(ncol=1,nrow=0))
>>
>> addexpandlist<-function(x,prev){
>>    if(nrow(prev$l)==prev$nelem){
>>      nextsize<-max(nrow(prev$l),1)
>>      prev$l<-rbind(prev$l,matrix(ncol=1,nrow=nextsize))
>>    }
>>    prev$nelem<-prev$nelem+1
>>    prev$l[prev$nelem]<-x
>>    return(prev)
>> }
>>
>> compressexpandlist<-function(prev){
>>    return(as.vector(prev$l[1:prev$nelem]))
>> }
>>
>> h<-function(dotot){
>>    v<-emptyexpandlist
>>    for(i in 1:dotot){
>>      v<-addexpandlist(FALSE,v)
>>    }
>>    return(compressexpandlist(v))
>> }
>>
>> #########################################
>>
>> dotot=100000
>> system.time(f(dotot))
>> #system.time(g(dotot))
>> system.time(h(dotot))
>>
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Jul 17, 2012 at 8:42 PM, Patrick Burns <pburns at pburns.seanet.com
>> <mailto:pburns at pburns.seanet.com>> wrote:
>>
>>     Johan,
>>
>>     If you don't know 'The R Inferno', it might
>>     help a little.  Circle 2 has an example of
>>     how to efficiently (relatively speaking) grow
>>     an object if you don't know the final length.
>>
>>     http://www.burns-stat.com/__pages/Tutor/R_inferno.pdf
>>     <http://www.burns-stat.com/pages/Tutor/R_inferno.pdf>
>>
>>     If you gave a simple example of how your code
>>     looks now and what you want it to do, then you
>>     might get some ideas of how to improve it.
>>
>>
>>     Pat
>>
>>
>>     On 17/07/2012 12:47, Johan Henriksson wrote:
>>
>>         Hello!
>>         I am optimizing my code in R and for this I need to know a bit
>>         more about
>>         the internals. It would help tremendously if someone could link
>>         me to a
>>         page with O()-complexities of all the operations.
>>
>>         In this particular case, I need something like a linked list
>>         with O(1)
>>         insertLast/First ability. I can't preallocate a vector since I
>>         do not know
>>         the final size of the list ahead of time.
>>
>>         The classic array-doubling trick would give me O(1) amortized
>>         time but it's
>>         a bit messy. The other alternative I see would be to recursively
>>         store
>>         lists (elem, (elem, (elem, (...)))), which I take also would
>>         work? But I'd
>>         rather go for a standard R solution if there is one!
>>
>>         cheers,
>>         /Johan
>>
>>
>>     --
>>     Patrick Burns
>>     pburns at pburns.seanet.com <mailto:pburns at pburns.seanet.com>
>>     twitter: @portfolioprobe
>>     http://www.portfolioprobe.com/__blog
>>     <http://www.portfolioprobe.com/blog>
>>     http://www.burns-stat.com
>>     (home of 'Some hints for the R beginner'
>>     and 'The R Inferno')
>>
>>
>>
>>
>>
>> --
>> --
>> -----------------------------------------------------------
>> Johan Henriksson, PhD
>> Karolinska Institutet
>> Ecobima AB - Custom solutions for life sciences
>> http://www.ecobima.se <http://www.ecobima.com> http://mahogny.areta.org
>> http://www.endrov.net
>>
>> <http://www.endrov.net>
>