[R] memory problem in exporting data frame

Henrik Bengtsson hb at maths.lth.se
Tue Sep 9 19:59:32 CEST 2003


> -----Original Message-----
> From: array chip [mailto:arrayprofile at yahoo.com] 
> Sent: den 9 september 2003 19:04
> To: Henrik Bengtsson; 'Thomas W Blackwell'; Patrick Burns
> Cc: r-help at stat.math.ethz.ch
> Subject: RE: [R] memory problem in exporting data frame
> 
> 
> Hi all,
> 
> Thanks for all the suggestions. I was able to get the data frame out
> by first deleting some other large objects in the directory, then
> converting the data frame into a matrix with as.matrix(), splitting
> the matrix into 4 blocks, and finally using write.table() to write
> the matrix into 4 files to be combined later.

write.table() has an 'append' argument, which means that you do not have
to write your matrix into separate files; you can append each block to
the same file as you go. Make sure that you turn off the column names
(col.names = FALSE) when you append, and that you control or turn off
the row names (row.names = FALSE). A minimal sketch is below.
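
For example, a sketch of the block-wise approach (the object name 'xxx'
comes from your example; the block size is a placeholder to tune to
your memory):

  xxx <- as.matrix(xxx)                # coerce once, up front
  blocksize <- 1000                    # rows per block; tune to memory
  starts <- seq(1, nrow(xxx), by = blocksize)
  for (s in starts) {
    e <- min(s + blocksize - 1, nrow(xxx))
    write.table(xxx[s:e, , drop = FALSE], file = "C:\\xxx",
                sep = "\t", quote = FALSE,
                row.names = FALSE, col.names = FALSE,
                append = (s > 1))      # overwrite first block, append rest
  }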

Henrik

> Thanks again
> 
> --- Henrik Bengtsson <hb at maths.lth.se> wrote:
> > Hi, I replied to a related question yesterday (Mon Sept 8, 2003)
> > with the subject "RE: [R] cannot allocate vector of size...". That
> > was also about running low on memory, but about *reading* data from
> > a file rather than writing. However, the problem is likely due to
> > the same thing.
> > 
> > You pass a large object to a function via an argument, and that
> > argument is then changed inside the function (in your case,
> > write.table() is doing x <- as.matrix(x)). As long as the argument
> > is only read, R is clever enough not to create a copy of it (pass
> > by reference if read-only), but as soon as you change it, R creates
> > a local copy (pass by value). Hence, you now have your original
> > 'xxx' object plus a local copy "inside" the function. This is
> > likely to be your problem.
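
You can watch this copy-on-modify behaviour yourself with tracemem(),
which (in later versions of R, built with memory profiling as the
standard CRAN binaries are) prints a message whenever an object is
duplicated. The function names below are made up for illustration:

  x <- runif(1e6)                               # ~8 MB of doubles
  f_read  <- function(v) sum(v)                 # only reads its argument
  f_write <- function(v) { v[1] <- 0; sum(v) }  # modifies its argument
  tracemem(x)                                   # report duplications of 'x'
  f_read(x)                                     # no copy: prints nothing
  f_write(x)                                    # prints a tracemem message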
> > 
> > You can use the workaround that Patrick Burns suggests and improve
> > it slightly: if you do not need the 'xxx' data frame anymore, you
> > can do 'xxx <- as.matrix(xxx)' yourself first. A better approach,
> > as you suggest yourself, except doing it block by block rather than
> > row by row, is to write your data frame block by block with a
> > reasonable block size (see the sketch near the top of this
> > message). This can of course be done using a for loop and
> > write.table(), but you will do better if you look at the code of
> > write.table() and avoid repeating the same overhead work in each
> > step.
> > 
> > Finally, and FYI, you might be able to shrink your original data
> > frame by considering the following:
> > 
> > i <- as.integer(1:1000)
> > d <- as.double(i)
> > df1 <- data.frame(i=i, d=d)
> > df2 <- data.frame(i=i, d=i)
> > object.size(df1)  # 24392 bytes
> > object.size(df2)  # 20392 bytes
> > 
> > However, note that x <- as.matrix(x) (which write.table() does)
> > will coerce the data into *one* data type (because it is a matrix).
> > In other words, the only thing you gain is a smaller 'xxx' object.
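
To see that coercion directly (a small sketch, reusing df1 and df2 from
the example above):

  m1 <- as.matrix(df1)   # integer + double columns
  storage.mode(m1)       # "double": the integer column was promoted
  m2 <- as.matrix(df2)   # both columns integer
  storage.mode(m2)       # "integer": no promotion needed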
> > 
> > Best wishes
> > 
> > Henrik Bengtsson
> > Lund University
> > 
> > > -----Original Message-----
> > > From: r-help-bounces at stat.math.ethz.ch
> > > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of
> > > Thomas W Blackwell
> > > Sent: den 9 september 2003 01:28
> > > To: array chip
> > > Cc: r-help at stat.math.ethz.ch
> > > Subject: Re: [R] memory problem in exporting data frame
> > > 
> > > 
> > > 
> > > Simplest is to save your workspace using save.image(), then
> > > delete a bunch of large objects other than the data frame that
> > > you want to export, and run write.table() again, now that you've
> > > made space for it.  A quick calc shows 17000 x 400 x 8 = 55 Mb,
> > > and that's just the size of the object that chokes R below.
> > >
> > > -  tom blackwell  -  u michigan medical school  -  ann arbor  -
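
That save/delete/retry workflow might look like this (a sketch; the
file name and the objects 'big1' and 'big2' are stand-ins for whatever
large objects you can spare):

  save.image("backup.RData")   # keep a copy of the full workspace first
  rm(big1, big2)               # drop large objects you do not need now
  gc()                         # give the freed memory back to R
  write.table(xxx, "C:\\xxx", sep = "\t",
              row.names = FALSE, col.names = FALSE, quote = FALSE)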
> > > 
> > > On Mon, 8 Sep 2003, array chip wrote:
> > >
> > > > I am having trouble exporting a large data frame out of R to be
> > > > used for other purposes. The data frame is numeric with size
> > > > 17000 x 400. It takes quite some time to start R as well. My
> > > > computer has 1 GB RAM. I used the following command to write
> > > > the data frame to a text file and got the error message below:
> > > >
> > > > > write.table(xxx, "C:\\xxx", sep="\t",
> > > > row.names=FALSE,col.names=FALSE,quote=FALSE)
> > > >
> > > > Error: cannot allocate vector of size 55750 Kb
> > > > In addition: Warning message:
> > > > Reached total allocation of 1023Mb: see help(memory.size)
> > > >
> > > > I tried to increase the memory size with memory.size(size=),
> > > > but it seems running the above command takes forever.
> > > >
> > > > What can I do with this error message to get the data out?
> > > >
> > > > Thanks
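
For reference, the memory functions mentioned above are specific to R
on Windows; a sketch of their use (the sizes are in MB and chosen
arbitrarily):

  memory.size()               # memory currently in use
  memory.size(max = TRUE)     # maximum used so far in this session
  memory.limit(size = 2048)   # raise the allocation cap (Windows only)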
> > > 
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> > 



