[R] How to make the "apply" faster

Debasish Pai Mazumder pai1981 at gmail.com
Sun Jul 10 22:38:14 CEST 2016


Thanks for your response. It is faster than before but still very slow. Any
other suggestion ?
-Deb


On Sun, Jul 10, 2016 at 2:13 PM, William Dunlap <wdunlap at tibco.com> wrote:

> There is no need to test that a logical equals TRUE: 'logicalVector==TRUE'
> is the
> same as just 'logicalVector'.
>
> There is no need to convert logical vectors to numeric, since rle() works
> on both
> types.
>
> There is no need to use length(subset(x, logicalVector)) to count how many
> elements
> in logicalVector are TRUE, just use sum(logicalVector).
>
> There is no need to make a variable, 'ans', then immediately return it.
>
> Hence your
>
>     b[b == TRUE] = 1
>     y <- rle(b)
>     ans <- length(subset(y$lengths[y$values == 1], y$lengths[y$values ==
> 1] >= 2))
>     return(ans)
>
> could be replaced by
>
>     y <- rle(b)
>     sum(y$lengths[y$values] >= 2)
>
> This gives some speedup, mainly for long vectors, but I find it more
> understandable.
> E.g., if f1 is your original function and f2 has the above replacement I
> get:
>   > d <- -sin(1:10000+sqrt(1:4))
>   > system.time(for(i in 1:10000)f1(d,.3))
>      user  system elapsed
>      5.19    0.00    5.19
>   > system.time(for(i in 1:10000)f2(d,.3))
>      user  system elapsed
>      3.65    0.00    3.65
>   > c(f1(d,.3), f2(d,.3))
>   [1] 1492 1492
>   > length(d)
>   [1] 10000
>
> If it were my function, I would also get rid of the part that deals with
> the threshhold
> and direction of the inequality and tell the user to to use f(data <= 0.3)
> instead of
> f(data, .3, "below").  I would also make the spell length an argument
> instead of
> fixing it at 2.  E.g.
>
>    > f3 <- function (condition, spellLength = 2)
>    {
>        stopifnot(is.logical(condition), !anyNA(condition))
>        y <- rle(condition)
>        sum(y$lengths[y$values] >= spellLength)
>    }
>    > f3( d >= .3 )
>    [1] 1492
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Sun, Jul 10, 2016 at 11:58 AM, Debasish Pai Mazumder <pai1981 at gmail.com
> > wrote:
>
>> Hi Everyone,
>> Thanks for your help. It works. I have similar problem when I am
>> calculating number of spell.
>> I am also calculation spell (definition: period of two or more days where
>> x
>> exceeds 70) using similar way:
>>
>> *new = apply(x,c(1,2,4),FUN=function(y) {fun.spell.deb(y, 70)})*
>>
>> where fun.spell.deb.R:
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *## Calculate spell durationfun.spell.deb <- function(data, threshold = 1,
>> direction = c("above", "below")){  #coln <- grep(weather, names(data))#
>> var <- data[,8]  if(missing(direction)) {direction <- "above"}
>> if(direction=="below") {b <- (data <= threshold)} else  {b <- (data >=
>> threshold)}    b[b==TRUE] = 1  y <-rle(b)  ans
>> <-length(subset((y$lengths[y$values==1]), (y$lengths[y$values==1])>=2))
>> return(ans)}*
>>
>> Do you have any idea how to make the "apply" faster here?
>>
>> -Deb
>>
>>
>> On Sat, Jul 9, 2016 at 3:46 PM, Charles C. Berry <ccberry at ucsd.edu>
>> wrote:
>>
>> > On Sat, 9 Jul 2016, Debasish Pai Mazumder wrote:
>> >
>> > I have 4-dimension array x(lat,lon,time,var)
>> >>
>> >> I am using "apply" to calculate over time
>> >> new = apply(x,c(1,2,4),FUN=function(y) {length(which(y>=70))})
>> >>
>> >> This is very slow. Is there anyway make it faster?
>> >>
>> >
>> > If dim(x)[3] << prod(dim(x)[-3]),
>> >
>> > new <-  Reduce("+",lapply(1:dim(x)[3],function(z) x[,,z,]>=70))
>> >
>> > will be faster.
>> >
>> > However, if you can follow Peter Langfelder's suggestion to use rowSums,
>> > that would be best. Even using rowSums(aperm(x,c(1,2,4,3)>=70,dims=3)
>> and
>> > paying the price of aperm() might be better.
>> >
>> > Chuck
>> >
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

	[[alternative HTML version deleted]]



More information about the R-help mailing list