[R] memory problem for R

Spencer Graves spencer.graves at pdf.com
Fri Jan 30 13:28:04 CET 2004


      Hello, Yun-Fang: 

      Prof. Ripley's comments will get you started.  Part of the key is 
finding informative ways to subset and summarize the data so you don't 
try to read it all into R at once.  You can read segments using 
arguments "skip" and "nrows" in "read.table".  You can then analyze a 
portion, save a summary, discard the bulk of the data and read another 
portion. 
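
      For instance, here is a rough sketch of the idea (the file name, 
chunk size, and choice of summary below are only placeholders for 
illustration): 

    ## read a large whitespace-delimited file in pieces with
    ## read.table(), keeping only a per-chunk summary
    chunk.size <- 50000
    col.names <- scan("bigfile.txt", what = "", nlines = 1)  # header row
    skip <- 1                    # lines already consumed (the header)
    totals <- NULL
    repeat {
        chunk <- try(read.table("bigfile.txt", header = FALSE,
                                col.names = col.names,
                                skip = skip, nrows = chunk.size),
                     silent = TRUE)
        if (inherits(chunk, "try-error") || nrow(chunk) == 0) break
        ## e.g. accumulate column sums (assumes numeric columns)
        s <- colSums(chunk)
        totals <- if (is.null(totals)) s else totals + s
        skip <- skip + nrow(chunk)
    }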
    
      Beyond this, you may know that Kalman filtering is essentially 
linear regression performed one observation, or one group of 
observations, at a time, downweighting "older" observations gracefully.  
It assumes that the regression parameters follow a random walk between 
observations or groups of observations.  I've done ordinary least 
squares with Kalman filtering software one observation at a time, 
simply by setting the migration variance to zero.  R software for 
Kalman filtering was discussed recently on this list; to find it, I 
would use the search facilities described in the posting guide at the 
end of every r-help email. 
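
      In case a concrete toy helps, here is a small sketch of ordinary 
least squares updated one observation at a time -- the Kalman recursion 
with the migration (state) variance set to zero and a vague prior on 
the coefficients (the prior variance of 1e6 below is an arbitrary 
placeholder, and the data are simulated): 

    ## recursive least squares: process the rows of X one at a time
    rls <- function(X, y) {
        p <- ncol(X)
        beta <- rep(0, p)           # starting coefficients
        P <- diag(1e6, p)           # vague prior covariance
        for (i in seq(along = y)) {
            x <- X[i, ]
            k <- P %*% x / drop(1 + crossprod(x, P %*% x))  # gain
            beta <- beta + drop(k) * drop(y[i] - crossprod(x, beta))
            P <- P - k %*% crossprod(x, P)                  # posterior cov.
        }
        beta
    }

    ## quick check against lm() -- the two should agree closely
    set.seed(1)
    X <- cbind(1, matrix(rnorm(200), 100, 2))
    y <- drop(X %*% c(1, 2, -1)) + rnorm(100)
    rls(X, y)
    coef(lm(y ~ X - 1))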

      hope this helps. 
      spencer graves

Prof Brian Ripley wrote:

>On Thu, 29 Jan 2004, Yun-Fang Juan wrote:
>
>  
>
>>Here is the exact error I got
>>----------------------
>>Read 73 items
>>Error: cannot allocate vector of size 1953 Kb
>>Execution halted
>>-----------------------
>>I am running R on FreeBSD 4.3
>>with dual CPUs and 2 GB of memory
>>Is that sufficient?
>>    
>>
>
>Clearly not.  What is the structure of your `attributes'?  As Andy Liaw
>said, the design matrix may be bigger than that if there are factors
>involved.  (And you need several copies of the design matrix.)
>
>I would try a 10% sample of the rows to get a measure of what will fit
>into your memory.  I have never seen a regression problem for which 600k
>cases were needed, and would be interested to know the context.  (It is
>hard to imagine that the cases are from a single homogeneous population
>and that a linear model fits so well that the random error is not 
>dominated by systematic error.)
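
(A sketch of one way to take such a sample without ever holding the 
full file in memory, reusing the chunked read.table() idea above; the 
file name, chunk size, and response name "y" are placeholders:)

    chunk.size <- 50000
    col.names <- scan("bigfile.txt", what = "", nlines = 1)
    skip <- 1
    samp <- NULL
    repeat {
        chunk <- try(read.table("bigfile.txt", header = FALSE,
                                col.names = col.names,
                                skip = skip, nrows = chunk.size),
                     silent = TRUE)
        if (inherits(chunk, "try-error") || nrow(chunk) == 0) break
        ## keep a random ~10% of the rows of each chunk
        samp <- rbind(samp, chunk[runif(nrow(chunk)) < 0.10, ])
        skip <- skip + nrow(chunk)
    }
    fit <- lm(y ~ ., data = samp)   # "y" is a hypothetical response column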
>
>  
>
>>Yun-Fang
>>----- Original Message -----
>>From: "Yun-Fang Juan" <yunfang at yahoo-inc.com>
>>To: <r-help at stat.math.ethz.ch>
>>Sent: Thursday, January 29, 2004 7:03 PM
>>Subject: [R] memory problem for R
>>
>>
>>    
>>
>>>Hi,
>>>I try to use lm to fit a linear model with 600k rows and 70 attributes.
>>>But I can't even load the data into the R environment.
>>>The error message says the vector memory is used up.
>>>
>>>Is there anyone having experience with large datasets in R? (I bet)
>>>
>>>Please advise.
>>>
>>>
>>>thanks,
>>>
>>>
>>>Yun-Fang
>>>



