[R] memory problem for R --Summary

Yun-Fang Juan yunfang at yahoo-inc.com
Mon Feb 2 20:59:21 CET 2004


Thank you very much for the replies you have sent me regarding the memory
problem.
The following is a summary.
(I tried to read through all the messages. I apologize if I overlooked your
message.)

Cheers,

Yun-Fang
----------------------------
Background:
a. Data: 1 million rows with 73 numeric attributes
b. Environment: R 1.7.1 on FreeBSD 4.3 with 2GB memory and dual CPUs
   (Pentium III/Pentium III Xeon/Celeron),
   with a data seg size limit of 1572864 kbytes
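
For scale, here is a rough estimate of what the data alone needs once it is
held as doubles. This is only a back-of-the-envelope sketch: it assumes 8 bytes
per numeric value and ignores the temporary copies R makes while reading and
fitting. The variable names are mine.

```r
# Rough memory estimate for the data set described above.
n_rows <- 1e6              # rows in the data
n_cols <- 73               # numeric attributes
bytes_per_double <- 8      # assumption: every value stored as a double

data_bytes <- n_rows * n_cols * bytes_per_double
data_mb <- data_bytes / 2^20          # size of one full copy, in MB

limit_mb <- 1572864 / 1024            # data seg size limit, kbytes -> MB

# How many complete copies of the data fit under the limit?
copies_that_fit <- floor(limit_mb / data_mb)

cat(sprintf("one copy: %.0f MB, limit: %.0f MB, copies that fit: %d\n",
            data_mb, limit_mb, as.integer(copies_that_fit)))
```

One copy is roughly 557 MB against a 1536 MB limit, so only two full copies
fit. lm() builds a model frame and a model matrix on top of the original data
frame, which is consistent with running out of memory on the full data set.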

Suggested Solutions:
z. Use SAS, since SAS does not try to read all the data into RAM.
a. Take a random sample from the large data set, e.g. 10% of the 1 million rows.
    (The option singular.ok=TRUE can be used in lm for singular matrices.)
b. Use a Kalman filter with migration variance = 0 (see the dse package for
details).
c. Add the following configuration: options(object.size=1e8)
   Results: still OOM
d. If the data is all numeric, add colClasses="numeric" in read.table().
   Results: read.table read in the data successfully, but I could not access
the dataset after loading
(even dataset[1:10,] didn't work).
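
Suggestions a and d can be combined roughly as follows. This is a minimal
sketch on a small synthetic data set, not the code I actually ran: the file,
the column names (y, x1, x2, x3) and the sizes are made up for illustration.
The all-zero column x3 stands in for the attributes that are zero most of the
time.

```r
# Synthetic stand-in for the real data (hypothetical file and column names).
set.seed(1)
n <- 1000
toy <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n))
toy$x3 <- 0   # an attribute that is always zero
path <- tempfile(fileext = ".txt")
write.table(toy, path, row.names = FALSE)

# Suggestion d: declare every column numeric so read.table() does not have
# to guess the types; giving nrows also helps it allocate memory up front.
dat <- read.table(path, header = TRUE, colClasses = "numeric", nrows = n)

# Suggestion a: fit on a 10% random sample of the rows.
idx <- sample(nrow(dat), size = nrow(dat) %/% 10)
fit <- lm(y ~ ., data = dat[idx, ], singular.ok = TRUE)

# The all-zero column is aliased with the intercept, so lm() reports NA
# for its coefficient instead of stopping with an error.
print(coef(fit))
```

In current versions of R, singular.ok = TRUE is already the default for lm(),
so writing it out mainly documents the intent.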

----- Original Message -----
From: "Liaw, Andy" <andy_liaw at merck.com>
To: "'Yun-Fang Juan'" <yunfang at yahoo-inc.com>; "Prof Brian Ripley"
<ripley at stats.ox.ac.uk>
Cc: <r-help at stat.math.ethz.ch>
Sent: Friday, January 30, 2004 11:44 AM
Subject: RE: [R] memory problem for R


> You still have not read the posting guide, have you?
>
> See more below.
>
> > From: Yun-Fang Juan
>
> [...]
>
> > I tried 10% sample and it turned out the matrix became
> > singular after I did that.
> > The reason is some of the attributes only have zero values
> > most of the time.
> > The data I am using is web log data and after some
> > transformation, they are all numeric.
> > Can we specify some parameters in read.table so that the
> > program will treat all the vars as numeric
> > (with this context, hopefully that will reduce the memory
> > consumption)  ?
>
> and you clearly have not read my (private) reply, either, in which I told
> you *exactly* how to do that, via the colClasses argument to read.table().
>
> Please take the help given to you seriously.  If you want attention, you
> have to pay attention.
>
> Andy
>
> > thanks a lot,
> >
> > Yun-Fang
>
>



