[R] What is the formula of Pseudo-F statistic in capscale in vegan?

Tue Dec 17 10:10:26 CET 2013

Dear Kristen Ross,

Kristen Ross <guayabitogirl <at> gmail.com> writes:

>tion; and (3) the R code used for this analysis.  

Sorry that I have to remove most of your original message: gmane won't
allow me to post if I add too little compared to the cited text.

This is now wild guessing, since there is nothing I can
reproduce, in particular as I cannot afford to buy a PRIMER licence and
cannot even see its manual. I have one guess, though. Look at the first
and last items of your table:
>
> (1) Table
>
>                     PRIMER pseudo-F   R pseudo-F
> SEQUENTIAL TESTS
> GroupSize           1.1904            1.5528
...
> VolAuton            2.2923            2.2925
>

I am not quite sure how to read this table, but I *assume* that one
of the numbers comes from PRIMER and one from vegan:::capscale (this whole
answer is based on that assumption). As you see, the first two numbers
are very different, and the last two are nearly equal. I cannot
decipher the PRIMER formula for pseudo-F you give below, but I guess that
PRIMER changes the denominator of the pseudo-F at every step. I guess
it uses the residual SS and residual df after the current term, so that
the denominator would include the variation explained by later variables
in the sequential tests as well as their df's. In vegan we use the same
denominator with the same degrees of freedom for every term. This
denominator, or scale, is the residual variation and residual df after
all explanatory variables (constraints). That would explain why only the
last term agrees: there the two denominators coincide. The vegan way is
similar to the one you get from ordinary anova of lm(). We refuse (and
have refused earlier) to implement anything else. However, using
add1(<capscale-result>, ..., test = "perm") can give you tests that
pretend that later variables in the sequence are not in the model.
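
A minimal sketch of the two denominators, assuming a current vegan and
using its built-in dune data as a stand-in (the H.BC and SScomp objects
from the original post are not available). The hand-computed "step-wise"
F is only a guess at what PRIMER does, in line with the guess above; it
is not taken from the PRIMER manual. The names full, m1, F_vegan and
F_step are illustrative.

```r
library(vegan)
data(dune, dune.env)                      # stand-in data, not the poster's

d    <- vegdist(dune, method = "bray")
full <- capscale(d ~ A1 + Moisture + Management, data = dune.env)
m1   <- capscale(d ~ A1, data = dune.env) # model with the first term only
n    <- nrow(dune)

ss1 <- m1$CCA$tot.chi                     # variation explained by the first term
q1  <- m1$CCA$qrank                       # its degrees of freedom

# vegan-style: residual variation and df after ALL constraints
F_vegan <- (ss1 / q1) / (full$CA$tot.chi / (n - full$CCA$qrank - 1))

# guessed step-wise style: residual variation and df after the current term only
F_step  <- (ss1 / q1) / (m1$CA$tot.chi / (n - q1 - 1))

c(vegan = F_vegan, stepwise = F_step)

# Sequential tests as vegan reports them (same denominator for every term)
anova(full, by = "terms", permutations = 199)

# Tests that pretend later variables are not in the model
m0 <- capscale(d ~ 1, data = dune.env)
add1(m0, scope = ~ A1 + Moisture + Management, test = "permutation")
```

For the last term in the sequence the two denominators are identical,
which is consistent with the close agreement seen for VolAuton.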

> (2) pseudo-F formula
> 
> We know that PRIMER uses the following formula to calculate the pseudo-F for
> a sequential test of significance (equation 4.3, Anderson, Gorley, and
> Clarke 2008, Chapter 4. Pg. 129, and based on pseudo-F equation in Legendre
> and Anderson (1999), Ecological Monographs vol. 69):
> 
> F = [(SSFull - SSReduced)/(qFull - qReduced)] / [(SSTotal - SSFull)/(N - qFull - 1)]
> 
>  (3) R code
> 
> ## creating Bray-Curtis of Biodiversity data
> 
> H.BC <- vegdist(H.Full [,14:211], "bray")
> 
> ## Distance based redundancy analysis (dbRDA)
> 
> m1<-capscale(H.BC ~ GroupSize + Board + MtgStyle + DmStyle + DifView +
> VolAuton, SScomp [,14:19], distance = "euclidean", add = TRUE)
> 
> ### NOTE: pseudo-F values are the same with or without correcting for
> negative eigenvalues (although they are different from other programs).
>
If you are claiming here that the pseudo-F values are the same in
vegan:::capscale with add=FALSE and add=TRUE, I claim that you are wrong,
or that you made an error. One source of error could be that in the
example above you set 'distance = "euclidean"': with truly Euclidean
distances the 'add' argument has no effect, since there are no negative
eigenvalues. However, the example above should *still* give
you negative eigenvalues and a change in pseudo-F, because you did two
contradictory things: you supplied non-Euclidean dissimilarities as
input, and you asked for Euclidean distances in the command. In this case,
the 'distance' argument is silently ignored and the input dissimilarities
are used. Please check this.
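
One way to run this check, sketched with vegan's built-in dune data as
a stand-in for the original objects and assuming a current vegan; the
model names m_plain, m_euc and m_add are illustrative.

```r
library(vegan)
data(dune, dune.env)                      # stand-in data, not the poster's

d <- vegdist(dune, method = "bray")       # precomputed, non-Euclidean input

m_plain <- capscale(d ~ Management, data = dune.env)
m_euc   <- capscale(d ~ Management, data = dune.env, distance = "euclidean")
m_add   <- capscale(d ~ Management, data = dune.env, add = TRUE)

# With dissimilarities as input, 'distance' should make no difference
all.equal(m_plain$tot.chi, m_euc$tot.chi)

# Printing the models shows whether an imaginary component
# (negative eigenvalues) is reported
m_plain
m_add

# Compare the pseudo-F with and without the additive constant
anova(m_plain, permutations = 199)$F[1]
anova(m_add,   permutations = 199)$F[1]
```

If the two F values differ, the 'add' argument did have an effect on
the dissimilarities that were supplied.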

I have no idea what you mean by "correcting" for negative eigenvalues.
You can apply a transformation that removes them, but I cannot see
how that would be a "correction".

Cheers, Jari Oksanen