[BioC] Re: Manova nuances

Michael Benjamin msb1129 at bellsouth.net
Mon Nov 24 02:53:44 MET 2003


This sounds very reasonable.  I'm having a bit of trouble with the
implementation.

How do you solve for a variable?  I know u * diag(d) * t(v) gets your
original data, but how do you pick variables?  Just taking the first
four columns of u or v alone doesn't work--I tried.  There must be a way
to combine d, u, and v to represent the first few variables in
low-dimensional space.

In other words, after you do svd, then what?  What do you compare? U? V?

Thanks,
Mike


-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch
[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Stephen P.
Baker
Sent: Saturday, November 22, 2003 8:57 AM
To: Michael Benjamin; bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Re: Manova nuances

The principal components are orthogonal and independent and measure
different things so it makes no sense to compare them, like comparing
horizontal to vertical or heart rate to IQ.

Treat the components like variables and perform the same analysis on
them
with ANOVA that you would have with MANOVA.  Like 2 groups, do a t-test,
3
do ANOVA, whatever analysis is appropriate for your experimental design.
If
ANY of these are significant, the MANOVA would have been significant.

Stephen
-.- -.. .---- .--. ..-.
Stephen P. Baker, MScPH , PhD(ABD)                      (508) 856-2625
Senior Biostatistician
(775) 254-4885 fax
Academic Computing Services
Lecturer in Biostatistics , Graduate School of Biomedical Sciences
University of Massachusetts Medical School
55 Lake Avenue North                          stephen.baker at umassmed.edu
Worcester, MA 01655  USA

----- Original Message ----- 
From: "Michael Benjamin" <msb1129 at bellsouth.net>
To: "'Baker, Stephen'" <Stephen.Baker at umassmed.edu>; "'Liaw, Andy'"
<andy_liaw at merck.com>; <bioconductor at stat.math.ethz.ch>
Sent: Friday, November 21, 2003 11:11 PM
Subject: RE: [BioC] Re: Manova nuances


> Can I do instead:
> comps1<-svd(teset[group1])$d
> comps2<-svd(teset[group2])$d
> t.test(comps1,comps2)
>
> Maybe I could just compare the top two or three components to one
> another?
>
> Mike
>
> -----Original Message-----
> From: Baker, Stephen [mailto:Stephen.Baker at umassmed.edu]
> Sent: Friday, November 21, 2003 3:05 PM
> To: Liaw, Andy; bioconductor at stat.math.ethz.ch
> Cc: msb1129 at bellsouth.net
> Subject: RE: [BioC] Re: Manova nuances
>
> Andy et al.
>
> (Thanks for correcting my typo on the spelling of "principal").
>
> Yes I know that ANOVA of n principal components will result in n
> p-values, however the SMALLEST p-value will be equivalent to a
> multivariate test of his hypotheses on his data.
>
> MANOVA and univariate ANOVA on the principal components are
essentially
> equivalent in theory and quite similar in that both approaches involve
> the characteristic roots and functions of the same design and
covariance
> matrices.
>
> The equivalence is based on the fact that multivariate hypotheses will
> be rejected only if the equivalent univariate hypotheses do not hold
for
> all variates (Morrison,1976).  Principle components simply transforms
> the original variates into new variates which conserve all the
original
> information. Michael Benjamin's problem is that he CANNOT run MANOVA
as
> he has fewer cases than variates however my suggested approach WOULD
> work.
>
> With regards to Michael's request/need for a SINGLE SUMMARY STATISTIC,
> he would use the minimum of the p-values for the appropriate effect
from
> the univariate ANOVA's on the principal components as his single
> p-value.  These are orthogonal tests and the minimum would be
equivalent
> to testing the same hypotheses with MANOVA on his dataset.
>
>
> The only caveat is that with K genes and n<K and he will be able to
test
> his hypotheses on the first n principal components which account for
the
> largest portions of the variation.  However, in my 20 years
experience,
> in most datasets the number of "significant" components (with
> eigenvalues >1) is usually much smaller than the number of variates.
It
> would be unusual for any real biological effect to not be represented
> among one or more of the first n components given n is not too small.
In
> his case that's 35 and I think that's probably enough.
>
> Best wishes
> Stephen
>
> -.- -.. .---- .--. ..-.
> Stephen P. Baker, MScPH, PhD (ABD)            (508) 856-2625
> Sr. Biostatistician- Information Services
> Lecturer in Biostatistics                     (775) 254-4885 fax
> Graduate School of Biomedical Sciences
> University of Massachusetts Medical School, Worcester
> 55 Lake Avenue North
stephen.baker at umassmed.edu
> Worcester, MA 01655  USA
>
>
>
> -----Original Message-----
> From: Liaw, Andy [mailto:andy_liaw at merck.com]
> Sent: Friday, November 21, 2003 8:13 AM
> To: Baker, Stephen; bioconductor at stat.math.ethz.ch
> Cc: 'msb1129 at bellsouth.net'
> Subject: RE: [BioC] Re: Manova nuances
>
>
> > From: Stephen P. Baker [mailto:stephen.baker at umassmed.edu]
> >
> > Principle component analyses should reduce your data array to
>   ^^^^^^^^^
>   Principal
>
> > as many independent components as you have samples, and  for
> > each sample get a score for each dimension.  These will have
> > the same total information as the original data.  These can
> > then be analysed separately with univariate anova but since
> > these are "orthogonal" analyses, multiple comparisons
> > adjustments would not be needed.
>
> The analysis you described is quite different than MANOVA, so
> the conclusion/interpretation would be quite different, too. MANOVA
> treats the data as coming from multivariate normal distribution, and
> tests whether all groups have the same mean vector.  What you
described
> is n (number of samples) ANOVA analyses that gives n p-values.
>
> Cheers,
> Andy
> Andy Liaw, PhD
> Biometrics Research      PO Box 2000, RY33-300
> Merck Research Labs           Rahway, NJ 07065
> mailto:andy_liaw at merck.com        732-594-0820
>
>
>
> > -.- -.. .---- .--. ..-.
> > Stephen P. Baker, MScPH , PhD(ABD)                      (508)
856-2625
> > Senior Biostatistician
> > (775) 254-4885 fax
> > Academic Computing Services
> > Lecturer in Biostatistics , Graduate School of Biomedical
> > Sciences University of Massachusetts Medical School
> > 55 Lake Avenue North
> > stephen.baker at umassmed.edu
> > Worcester, MA 01655  USA
> > --------------------------------------------------------------
> > --------------
> > ----
> > Date: Fri, 21 Nov 2003 00:18:54 -0500
> > From: "Michael Benjamin" <msb1129 at bellsouth.net>
> > Subject: [BioC] Manova nuances
> > To: <bioconductor at stat.math.ethz.ch>
> > Message-ID: <003401c3afee$f7eff000$7a05fea9 at amd>
> > Content-Type: text/plain; charset="US-ASCII"
> >
> >
> > Anybody here using manova?  It's powerful and pretty fast,
> > but I'm finding that you can't have more variables than
> > samples (limits its applicability to microarray research).
> > Is there any way around this? Assume
> >
> > dim(eset)
> >
> > 1200 35
> >
> > transeset<-t(eset)
> > fit<-manova(transeset ~ categories)
> > summary(fit)
> >
> > There is probably a complicated mathematical truth that
> > underlies this limitation--if anybody can shed some light,
> > that would be great.
> >
> > Also, if anyone knows of a quick, free multivariate tool that
> > summarizes all the tests into a single test statistic, that
> > would be much appreciated.
> >
> > Regards,
> > Michael Benjamin, MD
> > Emory University
> > Winship Cancer Institute
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://www.stat.math.ethz.ch/mailman/listinfo> /bioconductor
> >
>
>
>

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor



More information about the Bioconductor mailing list