[R] apply on large arrays

Erich Neuwirth erich.neuwirth at univie.ac.at
Thu Feb 14 19:46:27 CET 2008


 > system.time({
+   tab2 <- tab1 <- with(pisa1, table(CNT,GENDER,ISCOF,ISCOM))
+   tab2[] <- 0
+   tab2[which(tab1 == 1, arr.ind = TRUE)] <- 1
+   tab3 <- rowSums(tab2)
+ })
    user  system elapsed
    3.17    0.99    4.17
 >
 > system.time({
+   tab4 <- rowSums(tab1 == 1)
+ })
    user  system elapsed
    1.02    0.18    1.20
 >


And yes, the results were identical.
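
For anyone who wants to repeat the comparison without the PISA data, here is
a minimal sketch on made-up data. The data frame d, its size, and the number
of factor levels are assumptions for illustration only; they are far smaller
than the real pisa1, so the timings above will not be reproduced. It runs the
original apply() version and both suggested versions and checks that the
three agree.

```r
# Synthetic stand-in for pisa1 (hypothetical data, much smaller than the real table)
set.seed(1)
n <- 2000
d <- data.frame(
  CNT    = factor(sample(letters[1:5], n, replace = TRUE)),
  GENDER = factor(sample(c("f", "m"), n, replace = TRUE)),
  ISCOF  = factor(sample(1:20, n, replace = TRUE)),
  ISCOM  = factor(sample(1:20, n, replace = TRUE))
)

tab1 <- with(d, table(CNT, GENDER, ISCOF, ISCOM))

# Version 1: the original code, calling a function once per cell
tab2a <- apply(tab1, 1:4, function(x) ifelse(sum(x) == 1, 1, 0))
res_apply <- apply(tab2a, 1, sum)

# Version 2: zero the table, then set the cells equal to 1 by matrix indexing
tab2b <- tab1
tab2b[] <- 0
tab2b[which(tab1 == 1, arr.ind = TRUE)] <- 1
res_index <- rowSums(tab2b)

# Version 3: tab1 == 1 is a logical array; rowSums counts TRUE as 1
# and sums over every dimension except the first
res_direct <- rowSums(tab1 == 1)

# All three should give the same count per level of CNT
stopifnot(
  isTRUE(all.equal(as.vector(res_apply), as.vector(res_index))),
  isTRUE(all.equal(as.vector(res_index), as.vector(res_direct)))
)
```

Wrapping each version in system.time() on data of realistic size should show
the same ordering as above, though the exact numbers will depend on the
machine.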


Bill.Venables at csiro.au wrote:
> Was the answer the same as the one you were getting with the original
> code?
> 
> How long did the original code take compared to these two versions?
> 
> Cheers,
> Bill V. 
> 
> 
> Bill Venables
> CSIRO Laboratories
> PO Box 120, Cleveland, 4163
> AUSTRALIA
> Office Phone (email preferred): +61 7 3826 7251
> Fax (if absolutely necessary):  +61 7 3826 7304
> Mobile:                         +61 4 8819 4402
> Home Phone:                     +61 7 3286 7700
> mailto:Bill.Venables at csiro.au
> http://www.cmis.csiro.au/bill.venables/ 
> 
> -----Original Message-----
> From: Erich Neuwirth [mailto:erich.neuwirth at univie.ac.at] 
> Sent: Thursday, 14 February 2008 5:08 PM
> To: Venables, Bill (CMIS, Cleveland)
> Subject: Re: [R] apply on large arrays
> 
> Thanks, this version is definitely faster than the first one.
> system.time gives 0.13 instead of 0.79 seconds.
> 
> 
> 
> Bill.Venables at csiro.au wrote:
>> Hmm.  I think this could be faster still:
>>
>> 	tab1 <- with(pisa1, table(CNT,GENDER,ISCOF,ISCOM))
>> 	tab3 <- rowSums(tab1 == 1)
>>
>> but check it...
>>
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Venables, Bill (CMIS, Cleveland)
>> Sent: Thursday, 14 February 2008 10:30 AM
>> To: erich.neuwirth at univie.ac.at; r-help at stat.math.ethz.ch
>> Subject: Re: [R] apply on large arrays
>>
>> Your code is
>>
>>
>> 	tab1 <- with(pisa1, table(CNT,GENDER,ISCOF,ISCOM))
>> 	tab2 <- apply(tab1, 1:4, 
>> 			function(x) ifelse(sum(x) == 1, 1, 0))
>> 	tab3 <- apply(tab2, 1, sum)
>>
>> As far as I can see, step 2 (the problematic one) merely replaces any
>> entries in tab1 that are not equal to one by zeros.  I think this would
>> do the same job a bit faster:
>>
>> 	tab2 <- tab1 <- with(pisa1, table(CNT,GENDER,ISCOF,ISCOM))
>> 	tab2[] <- 0
>> 	tab2[which(tab1 == 1, arr.ind = TRUE)] <- 1
>> 	tab3 <- rowSums(tab2)
>>
>> If you don't need to keep tab1, you would make things even better by
>> removing it.
>>
>> Bill Venables.
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Erich Neuwirth
>> Sent: Thursday, 14 February 2008 9:52 AM
>> To: r-help
>> Subject: [R] apply on large arrays
>>
>> I have a big contingency table, of size approximately 60*2*500*500,
>> and I need to count the number of cells containing a count of 1 for each
>> of the factor values defining the first dimension.
>> Here is my attempt:
>>
>> tab1 <- with(pisa1, table(CNT, GENDER, ISCOF, ISCOM))
>> tab2 <- apply(tab1, 1:4, function(x) ifelse(sum(x) == 1, 1, 0))
>> tab3 <- apply(tab2, 1, sum)
>>
>> Computing tab2 is very slow.
>> Is there a faster and/or more elegant way of doing this?
> 

-- 
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459


