[R] MANOVA permutation testing

Fri Mar 16 17:59:28 CET 2007

On Fri, 2007-03-16 at 00:50 +0000, Tyler Smith wrote:
> Hi,
> 
> I've got a dataset with 7 variables for 8 different species. I'd like
> to test the null hypothesis of no difference among species for these
> variables. MANOVA seems like the appropriate test, but since I'm
> unsure of how well the data fit the assumptions of equal
> variance/covariance and multivariate normality, I want to use a
> permutation test. 
> 
> I've been through CRAN looking at packages boot, bootstrap, coin,
> permtest, but they all seem to be doing more than I need. Is the
> following code an appropriate way to test my hypothesis:
> 
> result.vect <- c()
> 
> for (i in 1:1000){
>   wilks <- summary.manova(manova(maxent~sample(max.spec)),
>                test="Wilks")$stats[1,2]
>   result.vect <- c(res.vect,wilks)
> }
> 
> maxent is the data, max.spec is a vector of species names. Comparing
> the result.vect with the wilks value for the unpermuted data suggests
> there are very significant differences among species -- but did I do
> this properly?
> 

Hi Tyler,

(without knowing more about your data) I think you are almost there, but
the R code can be made much more efficient.

When you create your result vector, is is of length 0. Each time you add
a result, R has to copy the current result object, enlarge it and so on.
This all takes a lot of time. Better to allocate storage first, then add
each result in turn be replacement. E.g.:

Using an example from ?summary.manova

tear <- c(6.5, 6.2, 5.8, 6.5, 6.5, 6.9, 7.2, 6.9, 6.1, 6.3,
               6.7, 6.6, 7.2, 7.1, 6.8, 7.1, 7.0, 7.2, 7.5, 7.6)
gloss <- c(9.5, 9.9, 9.6, 9.6, 9.2, 9.1, 10.0, 9.9, 9.5, 9.4,
                9.1, 9.3, 8.3, 8.4, 8.5, 9.2, 8.8, 9.7, 10.1, 9.2)
opacity <- c(4.4, 6.4, 3.0, 4.1, 0.8, 5.7, 2.0, 3.9, 1.9, 5.7,
                  2.8, 4.1, 3.8, 1.6, 3.4, 8.4, 5.2, 6.9, 2.7, 1.9)
Y <- cbind(tear, gloss, opacity)
rate <- factor(gl(2,10), labels=c("Low", "High"))

## define number of permutations
nperm <- 999
## allocate storage, here we want 999 + 1 for our observed stat
res <- numeric(nperm+1)
## do the loop - the seq(along ...) bit avoids certain problems
for(i in seq(along = res[-1])) {
## here we replace the ith value in the vector res with the stat
    res[i] <- summary(manova(Y ~ sample(rate)), 
                      test = "Wilks")$stats[1,2]
}
## now we append the observed stat onto the end of the result vector res
## we also store this in 'obs' for convenience
res[nperm+1] <- obs <- summary(manova(Y ~ rate), test =  
                               "Wilks")$stats[1,2]

## this is the permutation p-value - the proportion of the nperm
## permutations + 1 that are  greater than or equal to the 
## observed stat 'obs'
sum(res <= obs) / (nperm+1)

HTH,

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%