[R] Nested Fixed Effects - basic questions

Fri Sep 4 22:37:14 CEST 2009

In R and in experimental or mixed-model terminology, your lm model specifies
fixed effects only. As long as each data row represents a unique subject, you
are fine with lm. If not, you have to account for the repeated measurement of
subjects and will need other methods (potentially involving random effects).
Your model is a dummy-variable OLS (ordinary least squares) regression.
Mixed-effects models, which combine fixed and random effects, and
random-effects-only analyses are most commonly fitted with the nlme package
(Pinheiro and Bates) or the lme4 package (Doug Bates and co-authors), though
other packages also support mixed-effects modeling. Manuals and vignettes for
both are easy to find online.
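
To make the lm versus mixed-model distinction concrete, here is a minimal
sketch. It is not based on your data: the data frame d, the columns subject
and treatment, and all effect sizes are made up for illustration. It assumes
10 subjects measured 4 times each and uses lme() from nlme to add a random
intercept per subject.

```r
# Hypothetical repeated-measures data: 10 subjects, 4 measurements each
library(nlme)  # recommended package, shipped with standard R installations

set.seed(1)    # arbitrary seed, only for reproducibility
n_subj <- 10
n_rep  <- 4
d <- data.frame(
  subject   = factor(rep(seq_len(n_subj), each = n_rep)),
  treatment = factor(rep(c("a", "b"), times = n_subj * n_rep / 2))
)

# Each subject gets its own random offset on top of the treatment effect
subj_offset <- rnorm(n_subj, sd = 2)
d$y <- 1.5 * (d$treatment == "b") +
  subj_offset[as.integer(d$subject)] + rnorm(nrow(d))

# lm() treats all 40 rows as independent observations
fit_lm <- lm(y ~ treatment, data = d)

# lme() keeps treatment as a fixed effect and adds a random intercept
# for each subject
fit_lme <- lme(y ~ treatment, random = ~ 1 | subject, data = d)

summary(fit_lm)$coefficients
summary(fit_lme)$tTable
```

Comparing the two coefficient tables shows how the standard error of the
treatment effect changes once the within-subject correlation is modeled.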

Further, your effects are not nested. If each row stands for a different
unit of observation (e.g., subject), and if subjects are randomized into
treatments fix1, fix2, and fix3, then you have a completely randomized
factorial design (CRF). Nesting would imply something like students nested
in classes nested in schools, where each student is a member of only one
class and each class belongs to only one school. Then your fix columns would
look like this (with 8 students nested in 4 classes nested in 2 schools):

fix1 fix2 fix3
1    1    1
2    1    1
3    2    1
4    2    1
5    3    2
6    3    2
7    4    2
8    4    2
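
If you want to check this in R rather than by eye, the following sketch may
help. The helper is_nested() is a name I made up, not a function from any
package; it calls a factor nested in another if each of its levels occurs
with exactly one level of the other. The crossed data frame rebuilds the
fix1/fix2/fix3 layout from your message.

```r
# Nested example from the table above: student within class within school
nested <- data.frame(
  student = 1:8,
  class   = rep(1:4, each = 2),
  school  = rep(1:2, each = 4)
)

# Hypothetical helper: inner is nested in outer if every level of inner
# appears together with exactly one level of outer
is_nested <- function(inner, outer) {
  all(tapply(outer, inner, function(x) length(unique(x))) == 1)
}

is_nested(nested$student, nested$class)   # TRUE
is_nested(nested$class, nested$school)    # TRUE

# Layout from the question: every combination of fix1, fix2, fix3, twice
crossed <- expand.grid(rep = 1:2, fix3 = 0:1, fix2 = 0:1, fix1 = 0:1)
is_nested(crossed$fix2, crossed$fix1)     # FALSE
table(crossed$fix1, crossed$fix2)         # all four cells are filled
```

A cross-tabulation with no empty cells, as in the last line, is the typical
signature of crossed rather than nested factors.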

Thus, your effects are not nested (at least not in the data you show us).
To find out not only whether fix1, fix2, and fix3 each have an independent
effect but also whether they interact in their effect on your response, you
can include interaction terms. However, if the data you provided is your
entire dataset, you will likely overfit the model and inflate the standard
errors if you include all possible interactions (three two-way and one
three-way, which eat up 4 degrees of freedom) along with the main effects
and intercept (another 4 degrees of freedom), given that your data has only
16 observations.
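
The degrees-of-freedom count can be verified on the 16 rows you posted. This
is only a sketch: the object names d, main_only, and full are mine, and the
data frame is rebuilt with expand.grid() so that the rows come out in the
same order as in your message.

```r
# Data as posted in the question (16 rows, fix3 varying fastest after rep)
d <- expand.grid(rep = 1:2, fix3 = 0:1, fix2 = 0:1, fix1 = 0:1)
d$response <- c(16.260, 16.128, 22.969, 23.245,
                14.687, 14.635, 22.954, 23.345,
                19.866, 19.589, 22.748, 22.817,
                17.861, 17.872, 22.925, 23.138)

# Main effects only versus all interactions up to three-way
main_only <- lm(response ~ fix1 + fix2 + fix3, data = d)
full      <- lm(response ~ (fix1 + fix2 + fix3)^3, data = d)

length(coef(main_only)); df.residual(main_only)  # 4 coefficients, 12 residual df
length(coef(full));      df.residual(full)       # 8 coefficients,  8 residual df

# F-test of whether the 4 interaction terms jointly improve the fit
anova(main_only, full)
```

The full model spends half of the 16 observations on coefficients, which is
why I would be careful about interpreting it.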

Example:
#Simulate data
fix1=rep(0:1,each=8)
fix2=rep(c(0,0,1,1),4)
fix3=rep(0:1,8)
e=rnorm(16)

#Dependent variable
y=-1*fix1+2*fix2+1*fix3-0.75*fix1*fix2+0.9*fix1*fix3-2*fix2*fix3+
  1.5*fix1*fix2*fix3+e

#Run regression and show output
reg0=lm(y~(fix1+fix2+fix3)^3) #all interactions up to three-way
summary(reg0) 
#note that this is not very insightful with so few observations

#Same as above, just with a 10-times larger simulated dataset
fix1=rep(0:1,each=80)
fix2=rep(c(0,0,1,1),40)
fix3=rep(0:1,80)
e=rnorm(160)
y=-1*fix1+2*fix2+1*fix3-0.75*fix1*fix2+0.9*fix1*fix3-2*fix2*fix3+
  1.5*fix1*fix2*fix3+e
reg1=lm(y~(fix1+fix2+fix3)^3)
summary(reg1) 
#160 observations already work quite well
#coef estimates are typically within the margin of error of the true coefficients

The second example shows that using OLS to model your data is fine as long
as the errors (e in the simulated data) are independent and approximately
normally distributed and you have enough observations.
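
One way to look at both points, sketched here on freshly simulated data with
the same design as the second example: compare the estimates and their
confidence intervals with the true coefficients, and test the residuals for
normality. The seed and the vector name truth are my own choices;
shapiro.test() is just one of several possible checks.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
fix1 <- rep(0:1, each = 80)
fix2 <- rep(c(0, 0, 1, 1), 40)
fix3 <- rep(0:1, 80)
y <- -1*fix1 + 2*fix2 + 1*fix3 - 0.75*fix1*fix2 + 0.9*fix1*fix3 -
  2*fix2*fix3 + 1.5*fix1*fix2*fix3 + rnorm(160)
reg1 <- lm(y ~ (fix1 + fix2 + fix3)^3)

# True coefficients in the order lm() reports them:
# intercept, fix1, fix2, fix3, fix1:fix2, fix1:fix3, fix2:fix3, fix1:fix2:fix3
truth <- c(0, -1, 2, 1, -0.75, 0.9, -2, 1.5)
cbind(truth, estimate = coef(reg1), confint(reg1))

# Shapiro-Wilk test on the residuals; a small p-value would suggest
# non-normal errors
shapiro.test(residuals(reg1))
```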

Daniel

-------------------------
cuncta stricte discussurus
-------------------------

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Jojo Ziggy
Sent: Friday, September 04, 2009 1:17 PM
To: r-help at r-project.org
Subject: [R] Nested Fixed Effects - basic questions

Hi R people,

I have a very basic question to ask - I'm sorry if it's been asked before,
but I searched the archives and could not find an answer.  All the examples
I found were much more complicated/nuanced versions of the problem - my
question is much simpler.

I have data with multiple, nested fixed effects (as I understand it, fixed
effects are specified by the experimental design while random effects are
measured) and one continuous response variable.  All the fixed effects are
categorical.

e.g.
fix1    fix2    fix3    response
0    0    0    16.260
0    0    0    16.128
0    0    1    22.969
0    0    1    23.245
0    1    0    14.687
0    1    0    14.635
0    1    1    22.954
0    1    1    23.345
1    0    0    19.866
1    0    0    19.589
1    0    1    22.748
1    0    1    22.817
1    1    0    17.861
1    1    0    17.872
1    1    1    22.925
1    1    1    23.138

I was thinking I could use a linear model to determine whether any of the
nested fixed effects or their interactions affect the response, but I could
not determine how to specify whether effects were fixed or random, and how
to specify nesting.  

For example:
lm(response~ fix1+fix2+fix3)

The above, as I understand it, simply asks whether the effects fix1 through
fix4 have an effect on the response.  However, in reality my experimental
design has multiple levels of nesting:

fix1(fix2(fix3(fix4)))

So, how do I do this?  To specify nesting, do I need to use another type of
model such as lmer or glm? 

I also don't know whether the above example is specifying whether the
effects are fixed or random - how do I do this?

Thanks very much,
Jojo
