# [BioC] Re: Manova nuances

Stephen P. Baker stephen.baker at umassmed.edu
Mon Nov 24 14:18:33 MET 2003

```
Michael,
One of the matrices output by svd should have dimensions k by n (or n by k),
where k is the number of original variates and n the number of chips; its
columns are a set of eigenvectors.  The transpose of this matrix times the
vector of expression levels from one chip should produce a vector of length n
with that chip's values for the new components.  There should also be either
a vector or a diagonal matrix output with values in descending order; the
first corresponds to the variation in the first component, the second to the
next highest, etc.  Strictly, svd returns singular values here; their squares
are proportional to the eigenvalues.  With standardized variates, an
eigenvalue greater than 1 can be interpreted as indicating that the component
accounts for a significant portion of the variation in the original variables.

-.- -.. .---- .--. ..-.
Stephen P. Baker, MScPH , PhD(ABD)                      (508) 856-2625
Senior Biostatistician
(775) 254-4885 fax
Lecturer in Biostatistics , Graduate School of Biomedical Sciences
University of Massachusetts Medical School
55 Lake Avenue North                          stephen.baker at umassmed.edu
Worcester, MA 01655  USA

----- Original Message -----
From: "Michael Benjamin" <msb1129 at bellsouth.net>
To: "'Stephen P. Baker'" <stephen.baker at umassmed.edu>;
<bioconductor at stat.math.ethz.ch>
Sent: Sunday, November 23, 2003 8:53 PM
Subject: RE: [BioC] Re: Manova nuances

> This sounds very reasonable.  I'm having a bit of trouble with the
> implementation.
>
> How do you solve for a variable?  I know u %*% diag(d) %*% t(v) gets your
> original data, but how do you pick variables?  Just taking the first
> four columns of u or v alone doesn't work--I tried.  There must be a way
> to combine d, u, and v to represent the first few variables in
> low-dimensional space.
>
> In other words, after you do svd, then what?  What do you compare? U? V?
>
> Thanks,
> Mike
>
>
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Stephen P.
> Baker
> Sent: Saturday, November 22, 2003 8:57 AM
> To: Michael Benjamin; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Re: Manova nuances
>
> The principal components are orthogonal and independent and measure
> different things so it makes no sense to compare them, like comparing
> horizontal to vertical or heart rate to IQ.
>
> Treat the components like variables and perform the same analysis on them
> with ANOVA that you would have with MANOVA.  With 2 groups, do a t-test;
> with 3, do ANOVA; whatever analysis is appropriate for your experimental
> design.  If ANY of these are significant, the MANOVA would have been
> significant.
>
> Stephen
>
> ----- Original Message -----
> From: "Michael Benjamin" <msb1129 at bellsouth.net>
> To: "'Baker, Stephen'" <Stephen.Baker at umassmed.edu>; "'Liaw, Andy'"
> <andy_liaw at merck.com>; <bioconductor at stat.math.ethz.ch>
> Sent: Friday, November 21, 2003 11:11 PM
> Subject: RE: [BioC] Re: Manova nuances
>
>
> > Can I do instead:
> > comps1<-svd(teset[group1])$d
> > comps2<-svd(teset[group2])$d
> > t.test(comps1,comps2)
> >
> > Maybe I could just compare the top two or three components to one
> > another?
> >
> > Mike
> >
> > -----Original Message-----
> > From: Baker, Stephen [mailto:Stephen.Baker at umassmed.edu]
> > Sent: Friday, November 21, 2003 3:05 PM
> > To: Liaw, Andy; bioconductor at stat.math.ethz.ch
> > Cc: msb1129 at bellsouth.net
> > Subject: RE: [BioC] Re: Manova nuances
> >
> > Andy et al.
> >
> > (Thanks for correcting my typo on the spelling of "principal").
> >
> > Yes I know that ANOVA of n principal components will result in n
> > p-values, however the SMALLEST p-value will be equivalent to a
> > multivariate test of his hypotheses on his data.
> >
> > MANOVA and univariate ANOVA on the principal components are essentially
> > equivalent in theory and quite similar in that both approaches involve
> > the characteristic roots and functions of the same design and covariance
> > matrices.
> >
> > The equivalence is based on the fact that multivariate hypotheses will
> > be rejected only if the equivalent univariate hypotheses do not hold for
> > all variates (Morrison, 1976).  Principal components simply transform
> > the original variates into new variates which conserve all the original
> > information.  Michael Benjamin's problem is that he CANNOT run MANOVA, as
> > he has fewer cases than variates; however, my suggested approach WOULD
> > work.
> >
> > With regards to Michael's request/need for a SINGLE SUMMARY STATISTIC,
> > he would use the minimum of the p-values for the appropriate effect from
> > the univariate ANOVAs on the principal components as his single
> > p-value.  These are orthogonal tests and the minimum would be equivalent
> > to testing the same hypotheses with MANOVA on his dataset.
> >
> >
> > The only caveat is that with K genes and n<K he will only be able to test
> > his hypotheses on the first n principal components, which account for the
> > largest portions of the variation.  However, in my 20 years' experience,
> > in most datasets the number of "significant" components (with
> > eigenvalues >1) is usually much smaller than the number of variates.  It
> > would be unusual for any real biological effect to not be represented
> > among one or more of the first n components given n is not too small.  In
> > his case that's 35 and I think that's probably enough.
> >
> > Best wishes
> > Stephen
> >
> >
> >
> >
> > -----Original Message-----
> > From: Liaw, Andy [mailto:andy_liaw at merck.com]
> > Sent: Friday, November 21, 2003 8:13 AM
> > To: Baker, Stephen; bioconductor at stat.math.ethz.ch
> > Cc: 'msb1129 at bellsouth.net'
> > Subject: RE: [BioC] Re: Manova nuances
> >
> >
> > > From: Stephen P. Baker [mailto:stephen.baker at umassmed.edu]
> > >
> > > Principle component analyses should reduce your data array to
> >   ^^^^^^^^^
> >   Principal
> >
> > > as many independent components as you have samples, and  for
> > > each sample get a score for each dimension.  These will have
> > > the same total information as the original data.  These can
> > > then be analysed separately with univariate anova but since
> > > these are "orthogonal" analyses, multiple comparisons
> > > adjustments would not be needed.
> >
> > The analysis you described is quite different than MANOVA, so
> > the conclusion/interpretation would be quite different, too. MANOVA
> > treats the data as coming from multivariate normal distribution, and
> > tests whether all groups have the same mean vector.  What you described
> > is n (number of samples) ANOVA analyses that give n p-values.
> >
> > Cheers,
> > Andy
> > Andy Liaw, PhD
> > Biometrics Research      PO Box 2000, RY33-300
> > Merck Research Labs           Rahway, NJ 07065
> > mailto:andy_liaw at merck.com        732-594-0820
> >
> >
> >
> > > ----------------------------------------------------------------------
> > > Date: Fri, 21 Nov 2003 00:18:54 -0500
> > > From: "Michael Benjamin" <msb1129 at bellsouth.net>
> > > Subject: [BioC] Manova nuances
> > > To: <bioconductor at stat.math.ethz.ch>
> > > Message-ID: <003401c3afee$f7eff000$7a05fea9 at amd>
> > > Content-Type: text/plain; charset="US-ASCII"
> > >
> > >
> > > Anybody here using manova?  It's powerful and pretty fast,
> > > but I'm finding that you can't have more variables than
> > > samples (limits its applicability to microarray research).
> > > Is there any way around this? Assume
> > >
> > > dim(eset)
> > >
> > > 1200 35
> > >
> > > transeset<-t(eset)
> > > fit<-manova(transeset ~ categories)
> > > summary(fit)
> > >
> > > There is probably a complicated mathematical truth that
> > > underlies this limitation--if anybody can shed some light,
> > > that would be great.
> > >
> > > Also, if anyone knows of a quick, free multivariate tool that
> > > summarizes all the tests into a single test statistic, that
> > > would be much appreciated.
> > >
> > > Regards,
> > > Michael Benjamin, MD
> > > Emory University
> > > Winship Cancer Institute
> > >
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor at stat.math.ethz.ch
> > > https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
> > >
> >
> >
> >
>
>
>
>

```
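
A minimal sketch of the workflow discussed in the thread, to make the "after you do svd, then what?" step concrete. It is not code from any of the posters. The data are simulated to match the 1200 x 35 shape quoted for `eset`, the two-level `categories` factor is hypothetical, and centring each gene before the decomposition is an assumption the thread does not spell out. The point it illustrates is that the per-chip component scores come from combining `u` and `d` (equivalently, projecting the data onto `v`), and that those scores are what get fed to the univariate tests.

```r
# Simulated stand-in for the thread's data: 1200 genes (rows) by 35 chips (columns).
set.seed(1)
n_genes <- 1200
n_chips <- 35
eset <- matrix(rnorm(n_genes * n_chips), nrow = n_genes, ncol = n_chips)
categories <- factor(rep(c("A", "B"), length.out = n_chips))  # hypothetical groups

# Chips as rows, genes as columns; centre each gene (an assumption, see above).
x <- scale(t(eset), center = TRUE, scale = FALSE)

# Singular value decomposition: x equals u %*% diag(d) %*% t(v).
s <- svd(x)

# Component scores: one row per chip, one column per component.
# u %*% diag(d) is the same matrix as x %*% v.
scores <- s$u %*% diag(s$d)

# svd returns singular values; eigenvalues of the gene covariance matrix
# are the squared singular values divided by (number of chips - 1).
eigenvalues <- s$d^2 / (n_chips - 1)

# One univariate ANOVA per component. The last component is dropped because
# centring leaves at most n_chips - 1 components with non-zero variance.
keep <- seq_len(n_chips - 1)
p_values <- sapply(keep, function(j) {
  anova(lm(scores[, j] ~ categories))[["Pr(>F)"]][1]
})

# The single summary suggested in the thread: the smallest p-value.
min_p <- min(p_values)
```

With real data, `eset` and `categories` would be replaced by the actual expression matrix and design factor. Whether the smallest p-value can stand in for a MANOVA test without adjustment is the point Andy and Stephen disagree on above; the sketch only shows the mechanics.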