[R] fixed effects

Liaw, Andy andy_liaw at merck.com
Tue Mar 28 04:46:17 CEST 2006


I guess you meant X has 20,000 levels (and 40 observations each)?  In that
case lm() will attempt to create a design matrix that's 8e5 by 2e4, and
that's unlikely to fit in the RAM.

It is very easy to compute by hand.  I'm using a smaller data size first to
check the result against summary(lm()), then the size you specified
(hopefully with more correct arithmetics...) to show that it's quite doable
even on a modest laptop:

> set.seed(1)
> y <- rnorm(80)
> x <- factor(rep(paste("x", 1:20, sep=""), 4))
> summary(lm(y ~ x))$r.squared
[1] 0.2555885
> rsq <- function(x, y) {
+     sse <- sum(ave(y, x, FUN=function(x) x - mean(x))^2)
+     sstotal <- var(y) * (length(y) - 1)
+     1 - sse / sstotal
+ }
> rsq(x, y)
[1] 0.2555885
> set.seed(1)
> y <- rnorm(8e5)
> x <- factor(rep(paste("x", 1:2e4, sep=""), 40))
> system.time(rsq(x, y))
[1] 1.99 0.03 2.06   NA   NA

Andy
 

From: ivo welch
> 
> dear R wizards:
> 
> X is factor with 20,000*20=800,000 observations of 20,000 factors. 
> I.e., each factor has 20 observations.  y is 800,000 normally 
> distributed data points.  I want to see how much R^2 the X 
> factors can provide.  Easy, right?
> 
> > lm ( y ~ X)
> and
> >  aov( y ~ X)
> Error: cannot allocate vector of size 3125000 Kb
> 
> is this computationally infeasible?  (I am not an expert, but 
> off-hand, I thought this can work as long as the X's are just 
> factors = fitted means.)
> 
> > help.search("fixed.effects");
> 
> fixed.effects(nlme)                 Extract Fixed Effects
> fixed.effects.lmList(nlme)          Extract lmList Fixed Effects
> lme(nlme)                           Linear Mixed-Effects Models
> lmeStruct(nlme)                     Linear Mixed-Effects Structure
> nlme(nlme)                          Nonlinear Mixed-Effects Models
> nlmeStruct(nlme)                    Nonlinear Mixed-Effects Structure
> 
> Type 'help(FOO, package = PKG)' to inspect entry 'FOO(PKG) TITLE'.
> 
> ok---I want to read the fixed.effects help.  could the help 
> system tell me how to inspect these entries?
> 
> > help("fixed.effects")
> wrong
> > help.search("fixed.effects")
> two entries, as above.  nothing new.
> > ?fixed.effects
> wrong
> 
> eventually, it dawned on me that nlme in parens was not a 
> function argument, but the name of the package within which 
> fixed.effects lives.  Suggestion:  maybe a different notation 
> to name packages would be more intuitive in the help system.  
> yes, I know it now, but other novices may not.  even a colon 
> instead of a () may be more intuitive in this context.
> 
> > library(nlme); ?lme
> and then
> > lme(y ~ X)
> Error in getGroups.data.frame(dataMix, groups) :
>         Invalid formula for groups
> 
> 
> now I have to beg for some help.  ok, blatant free-riding.  
> the lme docs tells me it does the Laird and Ware model, but I 
> do not know this model.  the only two examples given at the 
> end of the lme help file seem to be similar to what I just 
> specified.  so, how do I execute a simple fixed-effects 
> model?  (later, I may want to add a variable Z that is a 
> continuous random variable.)  could someone please give me 
> one quick example?  help is, as always, highly appreciated.
> 
> sincerely,
> 
> /ivo welch
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
>




More information about the R-help mailing list