[R] Big data (over 2GB) and lmer

Ben Bolker bbolker at gmail.com
Thu Oct 21 21:00:46 CEST 2010


Michal Figurski <figurski <at> mail.med.upenn.edu> writes:

> I have a data set of roughly 10 million records, 7 columns. It has only 
> about 500MB as a csv, so it fits in the memory. It's painfully slow to 
> do anything with it, but it's possible. I also have another dataset of 
> covariates that I would like to explore - with about 4GB of data...
> 
> I would like to merge the two datasets and use lmer to build a mixed 
> effects model. Is there a way, for example using 'bigmemory' or 'ff', or 
> any other trick, to enable lmer to work on this data set?

   I don't think this will be easy.

   Do you really need mixed effects for this task?  That is, are
your numbers per group small enough that you will benefit
from the shrinkage etc. afforded by mixed models?  If you have
(say) 10000 individuals per group and 1000 groups, then I would
expect you to get very accurate estimates of the group coefficients
by fitting each group on its own; you could then calculate variances
etc. among these estimates.
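
   A rough sketch of that two-stage idea in base R, in case it helps.
Everything here is made up for illustration: the simulated data, the
column names (group, x, y), the sizes and the model y ~ x are all
assumptions, not your data or your model.

```r
# Simulated stand-in data; sizes, names and parameter values are assumptions.
set.seed(1)
n_groups  <- 200
n_per_grp <- 500
group <- rep(seq_len(n_groups), each = n_per_grp)
true_intercepts <- rnorm(n_groups, mean = 10, sd = 2)  # among-group sd = 2
x <- rnorm(n_groups * n_per_grp)
y <- true_intercepts[group] + 0.5 * x + rnorm(length(x), sd = 1)
dat <- data.frame(group = group, x = x, y = y)

# Stage 1: fit an ordinary regression within each group separately.
# Each fit only touches that group's rows.
fit_one <- function(d) coef(lm(y ~ x, data = d))
coefs <- t(sapply(split(dat, dat$group), fit_one))  # one row per group
colnames(coefs) <- c("intercept", "slope")

# Stage 2: summarise the variation among the per-group estimates.
among_group_mean <- colMeans(coefs)
among_group_var  <- apply(coefs, 2, var)
print(among_group_mean)
print(among_group_var)
```

   Note that the raw variance of the per-group estimates includes
their sampling variance as well as the true among-group variance, so
it slightly overstates the latter; with thousands of observations per
group that extra term should be small.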

   You might get more informed answers on r-sig-mixed-models at r-project.org ...

  Ben Bolker


