[R] Speeding indexing and sub-sectioning of 3d array

Swidan, Firas swidanf at janelia.hhmi.org
Thu Aug 10 20:34:27 CEST 2006


Hi Patrick,

Thanks for the help. The function I listed is just an example. I isolated
and kept only the problematic part in my code for clarity sake. I ended up
implementing the functionality in C and now it takes 22 seconds to calculate
the objective.

Best regards,
Firas.

On 8/9/06 4:41 PM, "Patrick Burns" <pburns at pburns.seanet.com> wrote:

> First off, I hope that the function you list is just an example
> since it only returns what the last iteration does -- obviously
> the same answer can be arrived at much quicker.
> 
> The main principal in speeding up loops is to do as little
> inside the loops as possible.  'fjj1' is essentially the same as
> the listed function, but with one slight cleanup.
> 
> fjj1 <-
> function(x, radius=3)
> {
>         dx <- dim(x)
>         dx1 <- dx[1]
>         dx2 <- dx[2]
>         dx3 <- dx[3]
>         for(i in (radius + 1):(dx1 - radius - 1)) {
>                 for(j in (radius + 1):(dx2 - radius - 1)) {
>                         for(k in (radius + 1):(dx3 - radius -1)) {
>                                 ans <- mean(x[(i-radius):(i+radius),
>                                         (j-radius):(j+radius),
> (k-radius):(k+radius)])
>                         }
>                 }
>         }
>         ans
> }
> 
> The time to run fjj1(jj, 3) on my machine where jj is a
> 245 by 175 by 150 array was 1222 seconds.
> 
> 'fjj2'  substantially reduces the number of sequences
> created.  It took 975 seconds.
> 
> fjj2 <-
> function(x, radius=3)
> {
>         dx <- dim(x)
>         dx1 <- dx[1]
>         dx2 <- dx[2]
>         dx3 <- dx[3]
>         rseq <- -radius:radius
>         for(i in (radius + 1):(dx1 - radius - 1)) {
>                 for(j in (radius + 1):(dx2 - radius - 1)) {
>                         for(k in (radius + 1):(dx3 - radius -1)) {
>                                 ans <- mean(x[i + rseq, j + rseq, k + rseq])
>                         }
>                 }
>         }
>         ans
> }
> 
> 
> 'fjj3' reduces some of the subscripting (but possibly at the
> expense of using more memory -- I'm not sure if it does or
> not).  It took 936 seconds.
> 
> fjj3 <-
> function(x, radius=3)
> {
>         dx <- dim(x)
>         dx1 <- dx[1]
>         dx2 <- dx[2]
>         dx3 <- dx[3]
>         rseq <- -radius:radius
>         for(i in (radius + 1):(dx1 - radius - 1)) {
>                 A <- x[i + rseq, , , drop=FALSE]
>                 for(j in (radius + 1):(dx2 - radius - 1)) {
>                         B <- A[, j + rseq, , drop=FALSE]
>                         for(k in (radius + 1):(dx3 - radius -1)) {
>                                 ans <- mean(B[ , , k + rseq])
>                         }
>                 }
>         }
>         ans
> }
> 
> 'fjj4' reverses the order of the loops.  Because of the
> way that arrays are stored, it makes sense that subscripting
> a sequence in the first dimension would be faster than
> subscripting subsequent dimensions.  This does seem to be
> the case.  'fjj4' took 839 seconds.
> 
> fjj4 <-
> function(x, radius=3)
> {
>         dx <- dim(x)
>         dx1 <- dx[1]
>         dx2 <- dx[2]
>         dx3 <- dx[3]
>         rseq <- -radius:radius
>         for(i in (radius + 1):(dx3 - radius - 1)) {
>                 A <- x[, ,i + rseq, drop=FALSE]
>                 for(j in (radius + 1):(dx2 - radius - 1)) {
>                         B <- A[, j + rseq, , drop=FALSE]
>                         for(k in (radius + 1):(dx1 - radius -1)) {
>                                 ans <- mean(B[k + rseq, , ])
>                         }
>                 }
>         }
>         ans
> }
> 
> 
> Another change that would make a marginal difference
> would be to generate the sequences controlling the inner
> loops once at the outset.
> 
> If the computation at the heart of the function really is a
> mean or something similar, then it is possible that there
> will be tricks to update that value more efficiently.
> 
> Finally, if this will be used enough that the speed is an
> issue, then rewriting it in C would be a good approach.
> 
> 
> Patrick Burns
> patrick at burns-stat.com
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of S Poetry and "A Guide for the Unwilling S User")
> 
> Swidan, Firas wrote:
> 
>> Hi,
>> 
>> I am having a problem with a very slow indexing and sub-sectioning of a 3d
>> array:
>> 
>>  
>> 
>>> dim(arr)
>>>    
>>> 
>> [1] 245 175 150
>> 
>> For each point in the array, I am trying to calculate the mean of the values
>> in its surrounding:
>> 
>> 
>> mean( arr[ (i - radius):(i + radius),
>>                                (j - radius):(j + radius),
>>                                (k - radius):(k + radius)] )
>> 
>> Putting that code in 3 for loops
>> 
>> calculateKMedian <- function( arr, radius){
>> 
>>  for( i in (radius + 1):(dim(arr)[1] - radius - 1) ){
>>    for( j in (radius + 1):(dim(arr)[2] - radius - 1) )
>>      for( k in (radius + 1):(dim(arr)[3] - radius - 1) ){
>> 
>> 
>>        mediansArr <- mean( arr[ (i - radius):(i + radius),
>>                                (j - radius):(j + radius),
>>                                (k - radius):(k + radius)] )
>> 
>>      }
>>  }
>>  return(mediansArr)
>> }
>> 
>> Results in a very slow run:
>> 
>>  
>> 
>>> system.time( calculateKMedian( a, 3))
>>>    
>>> 
>> [1] 423.468   0.096 423.631   0.000   0.000
>> 
>> If I replace 
>> 
>>        mediansArr <- mean( arr[ (i - radius):(i + radius),
>>                                (j - radius):(j + radius),
>>                                (k - radius):(k + radius)] )
>> 
>> With an access to the (I,j,k) cell's value
>> 
>> mediansArr <- arr[i,j,k]
>> 
>> The running time decreases to
>> 
>>  
>> 
>>> system.time( calculateKMedian( a, 3))
>>>    
>>> 
>> [1] 14.821  0.005 14.829  0.000  0.000
>> 
>> 
>> 
>> But 14 seconds are still pretty expensive for just scanning the array.
>> 
>> Is there anything I can do to speed the indexing and the sub-sectioning of
>> the 3d array in this case?
>> 
>> Thanks for the help,
>> Firas.
>> 
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> 
>>  
>>



More information about the R-help mailing list