[R] manova discriminant functions: Addendum

Fri Feb 23 01:19:21 CET 2007

Three weeks later, I have almost completely solved my problem (quoted below;
about within-subjects manova, and discriminant function analysis to
compliment a manova analysis). So for anyone who was secretly hoping someone
would respond to me:

* manova does not handle within-subjects variables in a manner which is very
   intuitive for those not in the field of statistics or mathematics.  The
   anova.mlm functions allow this sort of analysis, but for applied
   statisticians (read: social scientists), dealing with inner vs outer
   projections is a bit opaque.

* However, a bit of knowledge on what repeated measures MANOVA really does
   is very helpful. (No, really, it helps to know what you're trying to do!)
   The general idea is that you create contrast columns for the
   within-subjects factors, such that the contrast columns are orthogonal.
   Then, you submit these to a general MANOVA with between-subject predictors
   to test the within-between interactions.

* For within-subjects main effects, it's easier (for me) to just compute the
   hypothesis matrix myself. With orthogonal contrast factors, it's pretty
   simple: each row of the hypothesis matrix is the product of the column of
   means for the contrast columns times the mean for that row's contrast,
   times the number of observations. The error matrix can be taken directly
   from the fit for the interaction (see point 2); then, H %*% ginv(E) gives
   the effect matrix for the within subjects main effect, whose eigenvalues
   can be evaluated with the appropriate degrees of freedom. I wrote a
   function to do this, if anyone is interested.

* The current implementation of the Pillai() function uses Pillai's (1954)
   approximation of an F value for a given V, despite the much more robust
   and (seemingly) uncontroversial method of estimating F put forth by Muller
   in 1998. This is, of course, what SPSS and SAS do (unless you require SAS
   to do "exact" tests, which is still a misnomer), but it doesn't seem to me
   like R ought to do things poorly just to give the same results as SPSS.

* Estimating several within-subjects effects basically amounts to providing
   the relevant contrast matrices to the interaction analyses, and then
   testing subsets relative to the overall residual matrix and the right
   number of df.

* To get discriminant functions and variable loadings, use the lda()
   function. The fact that these are provided by manova() analogues in other
   software packages was just a red herring!

* However, this also does not work for discovering within-subjects
   discriminant functions. And that, I'm afraid, is where I'm now stuck...any
   suggestions on an lda() analogue for within-subject factors? An example:

data <- data.frame(treat=factor(rep(c("Control","Treatment"),3)),
                    time1=c(1,5,1,5,1,5),
                    time2=c(2,8,1,7,2,6),
                    time3=c(3,9,3,9,2,10))

data.poly <- data[,2:4] %*% contr.poly(3)

Now, discriminant functions on predicting the difference between Control and
Treatment, using the three timepoints is easy: Convert them to orthogonal
contrast variables, and then see how to discriminate treatment conditions
based on the linear and quadratic effects of time:

lda(data$treat ~ data.poly)

...but my question is how to do an lda predicting whether an observation
came from the time1, time2, or time3 column.

Thanks,
Adam

On Mon, 5 Feb 2007, Adam D. I. Kramer wrote:

> As an addendum to my earlier post, I am having another difficulty with
> getting manova() to behave as I would like: when I specify contrasts for my
> independent variable(s), I am unsure of how to test them. This is a
> contrived example of both of my questions:
>
> Assume three alertness measurements, "alert1" "alert2" "alert3", a
> within-subjects variable measuring their alertness at three timepoints,
> minutes after taking the drug (alert1), one hour (alert2), and two hours
> (alert3). The between-subjects variable is dosage, with dose==0 when
> subjects have had no drug, dose==1 when they have had a single dose, dose==2
> when they have had a double dose.
>
> My intuition says to do the following:
>
> alert <- cbind(alert1,alert2,alert3) %*% contr.poly(3)
> contrasts(dose) <- matrix(c(2,-1,-1,0,1,-1),3,2)
> m <- manova(alert ~ dose)
>
> ...what I want is two main effects (dose and alert) and one interaction
> (dose by alert), but also "main effect" and "interaction" for the two
> individual contrasts for dose. For the main effect for alert, and all of the
> dose*alert interactions, I need the discriminant function loadings of my two
> alertness contrasts in order to interpret the manner in which alertness is
> varying (e.g., is it varying in a linear or quadratic way).
>
> m2 <- manova (alert ~ dose)
> summary(m2)
> ...gives me a test for the dose * alertness interaction. Good! But I can't
> seem to find the contrasts I asked for for dose. In univariate ANOVA, I
> usually just call summary.lm() which gives me t-test coefficients for each
> level of the dose contrast...but calling summary.lm on a manova object
> returns t-tests on three unnamed coefficients, with 27 error degrees of
> freedom (when it should be 26, as I am intending to compute both dosage
> contrasts simultaneously). Also, I cannot tell whether it is the linear or
> quadratic contrast that is contributing to the differentiation of dosage
> levels--this is why I need the discriminant function loadings.
>
> m2 <- lm( apply(cbind(alert1,alert2,alert3),1,mean) ~ dose)
> summary(m2)
> ...gives me a test for the two contrasts, which can be pooled to get a main
> effect. Excellent!
>
> ...finally, for the main effect of alertness, I'm more or less at a loss.
> The question is whether the three alertness conditions differ from each
> other, or whether some linear combination of the linear and quadratic
> contrast columns is significantly different from zero...and then the
> relative weightings of the linear and quadratic contrasts.
>
> Any suggestions?
>
> Thanks,
> Adam
>
> On Mon, 5 Feb 2007, Adam D. I. Kramer wrote:
>
>> Hello,
>>
>> 	I've been playing with the manova() function to do some pretty
>> straightforward multivariate analyses, and I can't for the life of me 
>> figure
>> out how to get at the discriminant functions used. When predicting several
>> variables simultaneously, it's important to be able to gauge how much each
>> variable is contributing to the analysis...a simple p-value isn't really
>> enough. I find examination of the discriminant function loadings to be a
>> good indicator of this.
>> 
>> Thanks,
>> Adam Kramer
>> 
>