[R] Memory issues on a 64-bit debian system (quantreg)

roger koenker rkoenker at uiuc.edu
Thu Jun 25 00:24:24 CEST 2009


My earlier comment is probably irrelevant, since you are fitting only
one qss component and have no other covariates.
A word of warning, though, for when you go back to this on your new
machine: you are almost surely going to want to specify a large lambda
for the qss component in the rqss call. The default of 1 is likely to
produce a very rough fit with such a large dataset.
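
A minimal sketch of what setting lambda inside qss() might look like, on
simulated data rather than the real dataset. The data frame `sim`, the
columns `x` and `y`, the helper `rough()`, and the value 50 are all
made up for illustration; a suitable lambda for the actual data would
have to be found by experiment.

```r
library(quantreg)  # provides rqss() and qss()

set.seed(1)
n <- 2000                                  # small stand-in for the 500,000+ samples
sim <- data.frame(x = sort(runif(n, 0, 10)))
sim$y <- sin(sim$x) + rexp(n)              # hypothetical response, not the real data

# Default smoothing penalty: qss(x) is the same as qss(x, lambda = 1)
fit_default <- rqss(y ~ qss(x), tau = 0.99, data = sim)

# Larger penalty on the qss term (50 is an arbitrary illustrative value)
fit_smooth <- rqss(y ~ qss(x, lambda = 50), tau = 0.99, data = sim)

# Crude roughness summary: total absolute second difference of the
# fitted curve on an evenly spaced grid inside the range of x
grid <- data.frame(x = seq(0.5, 9.5, length.out = 200))
rough <- function(fit) {
  sum(abs(diff(as.numeric(predict(fit, newdata = grid)), differences = 2)))
}

cat("roughness, lambda = 1 :", rough(fit_default), "\n")
cat("roughness, lambda = 50:", rough(fit_smooth), "\n")
```

The larger-lambda fit should typically report the smaller roughness
value. Plotting both fits over the data is a quicker way to judge
whether the curve is still too wiggly.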


url:    www.econ.uiuc.edu/~roger            Roger Koenker
email    rkoenker at uiuc.edu            Department of Economics
vox:     217-333-4558                University of Illinois
fax:       217-244-6678                Urbana, IL 61801



On Jun 24, 2009, at 5:04 PM, Jonathan Greenberg wrote:

> Yep, it's looking like a memory issue -- we have 6GB RAM and 1GB swap
> -- I did notice that the analysis takes far less memory (and runs)  
> if I:
>
> tahoe_rq <- rqss(ltbmu_4_stemsha_30m_exp.img ~ ltbmu_eto_annual_mm.img,
>                  tau = 0.99, data = boundary_data)
>   (which I assume fits a line to the quantiles)
> vs.
> tahoe_rq <- rqss(ltbmu_4_stemsha_30m_exp.img ~ qss(ltbmu_eto_annual_mm.img),
>                  tau = 0.99, data = boundary_data)
>   (which is fitting a spline)
>
> I'd like to fit a spline to the upper 1% of the data, and to run the
> analysis on the full dataset to begin with rather than on a random
> subset. Unless anyone has hints as to whether I'm making a mistake in
> my call, I'll just wait until my new computer, which has more RAM,
> comes in next week.  Thanks!
>
> --j
>
>
> roger koenker wrote:
>> Jonathan,
>>
>> Take a look at the output of sessionInfo(); the platform string
>> should contain x86_64 if you have a 64-bit installation, or at least
>> I think this is the case.
>>
>> Regarding rqss(), my experience is that memory problems are usually
>> due to a call to model.matrix() early on in the processing, which is
>> supposed to create the design matrix, aka X, for the problem.  This
>> matrix is then coerced to the sparse matrix.csr format, but the dense
>> form is often too big for the machine to cope with.  Ideally, someone
>> would write an R version of model.matrix() that builds the matrix in
>> sparse form from the get-go, but this is a non-trivial task (or at
>> least so it appeared to me when I looked into it a few years ago).
>> An option is to roll your own X matrix: take a smaller version of the
>> data, apply the formula, look at the structure of X, and then try to
>> make a sparse version of the full X matrix.  This is usually not that
>> difficult, but "usually" is based on a rather small sample that may
>> not be representative of your problems.
>>
>> Hope that this helps,
>>
>> Roger
>>
>>
>> On Jun 24, 2009, at 4:07 PM, Jonathan Greenberg wrote:
>>
>>> Rers:
>>>
>>>  I installed R 2.9.0 from the Debian package manager on our amd64  
>>> system that currently has 6GB of RAM -- my first question is  
>>> whether this installation is a true 64-bit installation (should R  
>>> have access to > 4GB of RAM?)  I suspect so, because I was running  
>>> an rqss() (package quantreg, installed via install.packages() -- I  
>>> noticed it required a compilation of the source) and watched the  
>>> memory usage spike to 4.9GB (my input data contains > 500,000  
>>> samples).
>>>
>>>  With this said, after 30 mins or so of processing, I got the  
>>> following error:
>>>
>>> tahoe_rq <- rqss(ltbmu_4_stemsha_30m_exp.img ~ qss(ltbmu_eto_annual_mm.img),
>>>                  tau = 0.99, data = boundary_data)
>>> Error: cannot allocate vector of size 1.5 Gb
>>>
>>>  The dataset is a bit big (300 MB or so), so I'm not providing it
>>> unless necessary to solve this memory problem.
>>>
>>>  Thoughts?  Do I need to compile either the main R "by hand" or  
>>> the quantreg package?
>>>
>>> --j
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>



