[R] poor rbind performance

Tony Plate tplate at acm.org
Wed Jul 18 19:32:55 CEST 2007


As Jim points out, building up a data frame by rbinding in a loop can be 
a slow way to do things in R.
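
The reason is that each rbind() call copies every row accumulated so far, so the total amount of copying grows roughly quadratically with the number of pieces. Here is a rough sketch of the difference. It is not from Jim's message: the sizes and the names pieces, grow and once are made up for illustration, and the timings will vary by machine.

```r
# Sketch only: sizes are made up and kept small so this runs quickly.
pieces <- lapply(1:50, function(k) data.frame(id = rep(k, 200), x = rnorm(200)))

# Cumulative rbind: every iteration copies all rows accumulated so far
grow <- function(pieces) {
  out <- pieces[[1]]
  for (p in pieces[-1]) out <- rbind(out, p)
  out
}

# Single rbind over the whole list: the result is assembled once
once <- function(pieces) do.call("rbind", pieces)

print(system.time(a <- grow(pieces)))
print(system.time(b <- once(pieces)))
```

Both approaches should give the same rows in the same order; only the amount of copying differs.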

Here's an example of how you can easily read data frames into a list:

 > # Create 3 files
 > invisible(lapply(1:3, function(i)
 +     write.csv(file=paste("tmp",i,".csv",sep=""),
 +               data.frame(i=2*i+(1:2),c=letters[2*i+(1:2)]))))
 > # Read the files into a list of data frames
 > list.of.dfs <- lapply(paste("tmp",1:3,".csv",sep=""), read.csv,
 +                       row.names=1)
 > # rbind the data frames
 > myData <- do.call("rbind", list.of.dfs)
 > myData
   i c
 1 3 c
 2 4 d
 3 5 e
 4 6 f
 5 7 g
 6 8 h
 >

(and of course, these last two expressions can be composed into a single 
expression if you want)
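
For instance, here is a minimal sketch of the single-expression form applied to comma-separated files with a header row, as in the original question. The temporary directory data.dir, the file contents and N = 3 are assumptions made up so that the example is self-contained; substitute your own file names.

```r
# Sketch only: 'data.dir', the file contents and N = 3 are made up for illustration.
data.dir <- tempfile("myData")
dir.create(data.dir)
N <- 3
filenames <- file.path(data.dir, paste("myData_", 1:N, ".txt", sep = ""))

# Create small comma-separated files with a header row
for (k in 1:N) {
  write.table(data.frame(i = 2 * k + (1:2), c = letters[2 * k + (1:2)]),
              file = filenames[k], sep = ",", row.names = FALSE)
}

# Read every file and rbind once, in a single expression
myData <- do.call("rbind",
                  lapply(filenames, read.table, header = TRUE, sep = ","))
```

The extra arguments after read.table are passed on by lapply() to each read.table() call, so this replaces the whole loop with one read per file and one rbind over all of them.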

-- Tony Plate

Aydemir, Zava (FID) wrote:
> Hi
>  
> I rbind data frames in a loop in a cumulative way and the performance
> deteriorates very quickly.
>  
> My code looks like this:
>  
> for( k in 1:N)
> {
>     filename <- paste("/tmp/myData_",as.character(k),".txt",sep="")
>     myDataTmp <- read.table(filename,header=TRUE,sep=",")
>     if( k == 1) {
>         myData <- myDataTmp
>     }
>     else{
>         myData <- rbind(myData,myDataTmp)
>     }  
> }
>  
> Some more details:
> - the size of the stored text files is about 100,000 rows and 50 columns
> each
> - for k=1: rbind takes 0.0004 seconds
> - for k=2: rbind takes 13 seconds
> - for k=3: rbind takes 30 seconds
> - for k=4: rbind takes 36 seconds
> etc
>  
> Any suggestions to improve speed?
>  
> Thanks
>  
> Zava


