[R] multistratum glm?

Wed Apr 21 23:49:03 CEST 2004

On 18 Apr 2004 at 13:47, Christophe Pallier wrote:

You should probably look into glmmPQL (package MASS)
or GLMM (package lme4).
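
A minimal sketch of what such a glmmPQL call might look like, under
stated assumptions: the data below are simulated stand-ins (not the
real experiment), the four-category response is collapsed to one binary
question ('a' versus anything else), and only subject is given a random
intercept. The names d, is_a and subj_eff are hypothetical; sentence as
a second, crossed random factor is not handled here.

```r
library(MASS)   # provides glmmPQL; it relies on the nlme package

set.seed(1)

# Simulated layout: 20 subjects x 80 sentences, 20 sentences per type.
d <- expand.grid(subject = factor(1:20), sentence = factor(1:80))
d$type <- factor(letters[1:4][(as.integer(d$sentence) - 1) %/% 20 + 1])

# Made-up subject effects and type effects on the logit scale.
subj_eff <- rnorm(20, sd = 0.5)
type_eff <- c(a = 0, b = -1, c = -0.8, d = -2)
eta <- subj_eff[as.integer(d$subject)] + type_eff[as.character(d$type)]

# Binary outcome: 1 if the (simulated) response was 'a', 0 otherwise.
d$is_a <- rbinom(nrow(d), size = 1, prob = plogis(eta))

# Logit model: 'type' as fixed effect, random intercept per subject.
fit <- glmmPQL(is_a ~ type, random = ~ 1 | subject,
               family = binomial, data = d, verbose = FALSE)
summary(fit)
```

With the real data.frame, is_a would instead be computed from the
observed 'response' column, and a separate fit would be needed for each
binary contrast of interest.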

Kjetil Halvorsen

> Hello,
> 
> I routinely use aov and the Error term to perform analyses of
> variance of experiments with 'within-subject' factors. I wonder
> whether a notion like 'multistratum models' exists for glm models
> when performing a logit analysis (without being 100% sure whether
> this would make sense).
> 
> I have data of an experiment where the outcome is a categorical
> variable:
> 
> 20 individuals listened to 80 synthetic utterances (distributed in 4
> types) and were asked to classify them into four categories. (The
> variables in the data.frame are 'subject', 'sentence', 'type', and
> 'response'.)
> 
> Here is the table of counts table(type,response):
> 
>        response
> type  a   b  c   d
>   a 181 166 42  11
>   b  69 170 72  89
>   c  90 174 75  61
>   d  14 125 53 208
> 
> 
> There are several questions of interest, such as, for example:
> 
> - are responses distributed in the same way for the different types?
> 
> - are the numbers of 'a' responses for the 'b' and 'c' types 
> significantly different?
> 
> - is the proportion of 'd' over 'a' responses different for the 'b'
> and 'c' categories?
> 
> ...  
> 
> (I want to make inferences for the population of potential subjects
> on the one hand, and on the population of potential sentences on the
> other hand).
> 
> If the responses were continuous, I would just run two one-way
> anovas: one with the factor type over the means by subject*type, and
> the other with the factor type over the means by sentences (in
> type). And use t.test to compare between different pairs of types.
> 
> Now, as the answers are categorical, I am not sure about the correct
> approach and how to use R to perform such an analysis.
> 
> I could treat response as a factor, and use percentages of responses
> per subject in each cell of response*type, and run an anova on
> that...
> [aov(percentage~response*type+Error(subject/(response*type)))]
> But it seems incorrect to me to use the response of the subject as an
> independent variable (though I do not have a forceful argument).
> 
> Simple Chi-square tests are not the answer either, as a given subject
> contributed several times (80) to the counts in the table above.
> 
> My reading of MASS and of several other books suggests the use of
> logit/multinomial models when the response is categorical. But in all
> the examples provided, the units of analysis contribute only one
> measurement. Should I include the subject and sentences factors in
> the formula? But then they would be treated as fixed factors in the
> analysis, would they not?
> 
> 
> Any suggestion is welcome.
> 
> Christophe Pallier
> www.pallier.org