[R] dividing a matrix by positive sum or negative sum depending on the sign

David Winsemius dwinsemius at comcast.net
Wed Nov 11 18:12:27 CET 2009


On Nov 11, 2009, at 10:57 AM, David Winsemius wrote:

>
> On Nov 11, 2009, at 10:36 AM, Dimitris Rizopoulos wrote:
>
>> one approach is the following:
>>
>> mat <- rbind(c(-1, -1, 2, NA), c(3, 3, -2, -1), c(1, 1, NA, -2))
>>
>> mat / ave(abs(mat), row(mat), sign(mat), FUN = sum)
>
> Very elegant. My solution was a bit more pedestrian, but may have  
> some speed advantage:
>



I am wondering if there might be further performance improvements if  
sums were pre-calculated before the ifelse scaling step.

Perhaps:
 > mat <- matrix(sample(-4:4, 100, replace = TRUE), ncol = 10)
 > system.time(replicate(10000, t(apply(mat, 1, function(x) {
 +     negs <- sum(x[x < 0], na.rm = TRUE)
 +     poss <- sum(x[x > 0], na.rm = TRUE)
 +     ifelse(x < 0, -x / negs, x / poss)
 + }))))
    user  system elapsed
   9.420   0.103   9.619
 > system.time(replicate(10000, t(apply(mat, 1, function(x)
 +     ifelse(x < 0, -x / sum(x[x < 0], na.rm = TRUE),
 +            x / sum(x[x > 0], na.rm = TRUE))))))
    user  system elapsed
   8.206   0.035   8.231

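Pre-calculating did not help: in this run the pre-calculated version
(first timing) was about 15% slower than the inline version (second
timing). Replacing the ifelse() with its Boolean algebra equivalent,
however, cut the time by roughly 40% relative to the inline version
(about 50% relative to the pre-calculated one). Here it is applied to
the original 3 x 4 example matrix: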

 > t(apply(mat, 1, function(x)
 +     -x * (x < 0) / sum(x[x < 0], na.rm = TRUE) +
 +      x * (x > 0) / sum(x[x > 0], na.rm = TRUE)))
      [,1] [,2]       [,3]       [,4]
[1,] -0.5 -0.5  1.0000000         NA
[2,]  0.5  0.5 -0.6666667 -0.3333333
[3,]  0.5  0.5         NA -1.0000000


 > system.time(replicate(10000, t(apply(mat, 1, function(x)
 +     -x * (x < 0) / sum(x[x < 0], na.rm = TRUE) +
 +      x * (x > 0) / sum(x[x > 0], na.rm = TRUE)))))
    user  system elapsed
   4.805   0.041   4.839
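
One caveat that seems worth checking before adopting the arithmetic form:
it adds the "negative" and "positive" terms for every element, so a row
with no negative values (or no positive values) divides 0 by 0 and the
NaN propagates to the whole row, whereas ifelse() only picks from the
branch it needs. The sketch below illustrates this. The function names
scale_ifelse, scale_arith and scale_guarded are made up for this example,
and scale_guarded is just one possible workaround, not something that
was timed here.

```r
# Row-wise scalers; the first two wrap the expressions timed above.
scale_ifelse <- function(x) {
  ifelse(x < 0, -x / sum(x[x < 0], na.rm = TRUE),
         x / sum(x[x > 0], na.rm = TRUE))
}

scale_arith <- function(x) {
  -x * (x < 0) / sum(x[x < 0], na.rm = TRUE) +
    x * (x > 0) / sum(x[x > 0], na.rm = TRUE)
}

# Hypothetical guarded variant: only touch the elements of each sign,
# so an empty sign group never produces 0/0.
scale_guarded <- function(x) {
  negs <- sum(x[x < 0], na.rm = TRUE)
  poss <- sum(x[x > 0], na.rm = TRUE)
  i_neg <- which(x < 0)   # which() drops NA comparisons
  i_pos <- which(x > 0)
  out <- x
  out[i_neg] <- -x[i_neg] / negs
  out[i_pos] <- x[i_pos] / poss
  out
}

x <- c(1, 3)          # a row with no negative values
scale_ifelse(x)       # 0.25 0.75
scale_arith(x)        # NaN NaN
scale_guarded(x)      # 0.25 0.75
```

For rows that contain both signs, as in the example matrix, all three
give the same answer.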

I could not figure out how Jeff applied the two functions he presented,
so I am unable to compare any of these methods with his strategy.
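
For what it is worth, the rowSums idea from the original question can
also be written without apply() at all. The following is only a sketch
under my own assumptions: scale_by_sign is a hypothetical name, it is
not a reconstruction of the posDivFun/negDivFun code, and I have not
timed it against the versions above.

```r
# Hypothetical fully vectorised version: compute both row sums once,
# then divide each element by the sum that matches its sign.
scale_by_sign <- function(mat) {
  is_pos <- !is.na(mat) & mat > 0
  is_neg <- !is.na(mat) & mat < 0

  # Zero out everything that is not of the wanted sign (including NAs).
  pos_sum <- rowSums(replace(mat, !is_pos, 0))   # >= 0
  neg_sum <- rowSums(replace(mat, !is_neg, 0))   # <= 0

  rows <- row(mat)   # row index of every cell
  out <- mat         # NAs and zeros are left untouched
  out[is_pos] <- mat[is_pos] / pos_sum[rows[is_pos]]
  out[is_neg] <- -mat[is_neg] / neg_sum[rows[is_neg]]
  out
}

mat <- rbind(c(-1, -1, 2, NA), c(3, 3, -2, -1), c(1, 1, NA, -2))
scale_by_sign(mat)
```

Because only cells of the matching sign are divided, a row with no
negative (or no positive) values does not produce NaN.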

-- 
David.
>
>
> > system.time(replicate(10000, t(apply(mat, 1, function(x)
> +     ifelse(x < 0, -x / sum(x[x < 0], na.rm = TRUE),
> +            x / sum(x[x > 0], na.rm = TRUE))))))
>   user  system elapsed
>  5.958   0.027   5.977
>
> > system.time(replicate(10000,
> +     mat / ave(abs(mat), row(mat), sign(mat), FUN = sum)))
>   user  system elapsed
> 12.886   0.064  12.886
>
> -- 
> David
>>
>>
>> I hope it helps.
>>
>> Best,
>> Dimitris
>>
>>
>> Hao Cen wrote:
>>> Hi,
>>> I have a matrix with positive numbers, negative numbers, and NAs. An
>>> example of the matrix is as follows
>>> -1 -1 2 NA
>>> 3 3 -2 -1
>>> 1 1 NA -2
>>> I need to compute a scaled version of this matrix. The scaling method
>>> is to divide each positive number in a row by the sum of the positive
>>> numbers in that row, and to divide each negative number in a row by the
>>> sum of the absolute values of the negative numbers in that row.
>>> So the resulting matrix would be
>>> -1/2 -1/2 2/2 NA
>>> 3/6 3/6 -2/3 -1/3
>>> 1/2 1/2 NA -2/2
>>> Is there an efficient way to do that in R? One way I am using is
>>> 1. rowSums for positive numbers in the matrix
>>> 2. rowSums for negative numbers in the matrix
>>> 3. sweep(mat, 1, posSumVec, posDivFun)
>>> 4. sweep(mat, 1, negSumVec, negDivFun)
>>> posDivFun = function(x,y) {
>>>       xPosId = x>0 & !is.na(x)
>>>       x[xPosId] = x[xPosId]/y[xPosId]
>>>       return(x)
>>> }
>>> negDivFun = function(x,y) {
>>>       xNegId = x<0 & !is.na(x)
>>>       x[xNegId] = -x[xNegId]/y[xNegId]
>>>       return(x)
>>> }
>>> It is not fast enough though. This scaling is to be applied to  
>>> large data
>>> sets repetitively. I would like to make it as fast as possible. Any
>>> thoughts on improving it would be appreciated.
>>> Thanks
>>> Jeff
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> -- 
>> Dimitris Rizopoulos
>> Assistant Professor
>> Department of Biostatistics
>> Erasmus University Medical Center
>>
>> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
>> Tel: +31/(0)10/7043478
>> Fax: +31/(0)10/7043014
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



