[R] rbind() overwriting data.frame()

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Mon Sep 6 17:24:25 CEST 2010


This will give a data frame with 0 rows and 22 columns.

data.frame(matrix(nrow = 0,  ncol = 22, dimnames = list(NULL,
LETTERS[1:22])))
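
Since your columns are all strings, you may prefer character columns
(the matrix() call above is filled with NA, so the columns should come
out as logical). A minimal sketch, assuming you want 22 character
columns named A to V; the names n_cols and empty are only illustrative:

```r
# Build a zero-row data frame with 22 character columns named A..V.
n_cols <- 22
empty <- as.data.frame(
  setNames(
    replicate(n_cols, character(0), simplify = FALSE),  # list of empty character vectors
    LETTERS[seq_len(n_cols)]
  ),
  stringsAsFactors = FALSE  # keep strings as character, not factor
)
dim(empty)             # number of rows and columns
sapply(empty, class)   # column types
```
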

But you should avoid growing data frames if the final data frame is going
to be large. Every rbind() call copies the whole data frame, so you are
very likely to get memory problems. It is much better to create a large
enough data frame and then overwrite the rows. And it is faster too...

> nrows <- 2000
> ncols <- 22
> system.time({
+ tmp <- data.frame(matrix(nrow = 0,  ncol = ncols))
+ for(i in seq_len(nrows)){
+ tmp <- rbind(tmp, rnorm(ncols))
+ }
+ })
   user  system elapsed 
   7.83    0.02    7.86 
> system.time({
+ tmp <- data.frame(matrix(nrow = nrows,  ncol = ncols))
+ for(i in seq_len(nrows)){
+ tmp[i, ] <- rnorm(ncols)
+ }
+ })
   user  system elapsed 
   3.75    0.00    3.76 

# In this case an sapply() construction was even faster.
# Note that the result is a matrix, not a data frame.

> system.time({
+ tmp <- t(sapply(seq_len(nrows), function(i){
+ rnorm(ncols)
+ }))
+ })
   user  system elapsed 
   0.02    0.00    0.02 
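
If the rows are strings and cannot be generated in one sapply() call, a
similar idea is to collect them in a preallocated list and bind them
once at the end. This is a different technique from the loops timed
above and I have not timed it here; it is only a sketch. The sprintf()
call is a hypothetical stand-in for whatever produces your 22 strings,
and the column names A to V are an assumption.

```r
nrows <- 2000
ncols <- 22

# Preallocate a list with one slot per row instead of growing a data frame.
rows <- vector("list", nrows)
for (i in seq_len(nrows)) {
  # Placeholder data: 22 strings of 10 characters each for row i.
  rows[[i]] <- sprintf("r%05d_c%02d", i, seq_len(ncols))
}

# Bind all rows in a single call, then convert the character matrix once.
tmp <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
names(tmp) <- LETTERS[seq_len(ncols)]
dim(tmp)
```

If the final number of rows is not known in advance, the list can be
made larger than needed and the unused (NULL) slots dropped before the
do.call() step.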




------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be


> -----Original message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On behalf of rajesh j
> Sent: Monday 6 September 2010 16:57
> To: r-help at r-project.org
> Subject: [R] rbind() overwriting data.frame()
> 
> Hi,
> 
> first off, I wanna ask how do I declare a data.frame of 0 
> rows and n columns?
> 
> Coming to my problem,
> 
> I have a data.frame of 22 columns and a dynamic number of rows, which I 
> insert using rbind. The total number of rows could go up to 
> 200,000. The problem is that after about 800 or 900 rows have been 
> inserted, rbind starts overwriting the data.frame and I end up 
> with a total of 800-900 rows. What is up with that?
> The 22 columns are all strings each having about 10 characters
> --
> Rajesh.J
> 
