[R] GLM / large dataset question

Matthew Dowle mdowle at mdowle.plus.com
Wed Mar 31 19:26:09 CEST 2010


Geelman,

This appears to be your first post to this list. Welcome to R. Nearly 2 days 
is quite a long time to have waited, though, so you are unlikely to get a reply now.

Feedback: the question is quite vague and imprecise. The answer depends on 
which R you mean (32-bit or 64-bit) and how much RAM you have. It also 
depends on your data and what you want to do with it. Did you mean 100.000 
(i.e. one hundred, read with a decimal point) or 100,000 (one hundred 
thousand)? Also, '8000 explanatory variables' seems a lot, especially 
stored in a single factor. There is no R code in your post, so we can't 
tell whether you are using glm correctly or not. Rather than describing the 
data in words, you could provide the output of object.size() and dim() on it.
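As a sketch of the diagnostics requested above, using entirely made-up data of roughly the size described (assuming 100,000 rows and one factor with 8000 levels; the variable names are hypothetical):

```r
## Simulated stand-in for the poster's data (assumption: 100,000
## observations, binary response, one factor with 8000 levels).
set.seed(1)
n   <- 100000
dat <- data.frame(
  y = rbinom(n, 1, 0.5),
  x = factor(sample(8000, n, replace = TRUE))
)

dim(dat)          # rows and columns of the data
object.size(dat)  # memory used by the data itself
nlevels(dat$x)    # number of factor levels
```

Note that the data frame itself is small, but glm() would expand the factor into a dense model matrix of roughly 100,000 x 8,000 doubles, on the order of 6 GB, which is likely the real obstacle here.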

No reply often, but not always, means you haven't followed some detail of 
the posting guide, or haven't read this: 
http://www.catb.org/~esr/faqs/smart-questions.html.

HTH
Matthew

"geelman" <geelman at zonnet.nl> wrote in message 
news:MKEDKCMIMCMGOHIDFFMBIEKLCAAA.geelman at zonnet.nl...
> LS,
>
> How large a dataset can glm fit with a binomial link function?  I have a 
> set
> of about 100.000 observations and about 8000 explanatory variables (a 
> factor
> with 8000 levels).
>
> Is there a way to find out how large datasets R can handle in general?
>
>
>
> Thanks in advance,
>
>
> geelman
>


