cashaw at bcm.tmc.edu
Thu Dec 18 19:10:04 MET 2003
> I agree with some of what you say Chad, the problem is that most
> multivariate methods are built on top of the marginal tests. For instance
> machine learning methods are based on gene subsets for each of k cross
Right. I recognize that gene selection is a central component of many
sequential data analysis schemes -- "at stage 1" pick a set of genes which
(by a selection scheme) show regulation in the array experiments -- then at
stage 2 you do something with that.
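
To make the two-stage scheme concrete, here is a minimal sketch of a "stage 1" marginal selection step in Python. It is only an illustration, not anything used in this thread: the function names (welch_t, select_top_genes), the choice of a Welch t statistic as the marginal test, and the toy expression values are all assumptions.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch two-sample t statistic for one gene (a marginal, per-gene test)."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se if se > 0 else 0.0


def select_top_genes(expr, labels, n_keep):
    """Stage 1: rank genes by |t| and keep the n_keep largest.

    expr   -- dict mapping gene name -> list of expression values, one per array
    labels -- list of 0/1 group labels, one per array
    """
    scores = {}
    for gene, values in expr.items():
        g0 = [v for v, lab in zip(values, labels) if lab == 0]
        g1 = [v for v, lab in zip(values, labels) if lab == 1]
        scores[gene] = abs(welch_t(g0, g1))
    return sorted(scores, key=scores.get, reverse=True)[:n_keep]


# Hypothetical toy data: 3 genes measured on 6 arrays (3 per group).
expr = {
    "geneA": [1.0, 1.2, 0.9, 3.1, 2.9, 3.3],  # clearly shifted between groups
    "geneB": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # no shift
    "geneC": [0.5, 1.5, 1.0, 1.2, 0.8, 1.6],  # noisy, small shift
}
labels = [0, 0, 0, 1, 1, 1]

selected = select_top_genes(expr, labels, n_keep=1)
# Stage 2 (a classifier, clustering, etc.) would then see only `selected`.
```

Everything downstream inherits whatever stage 1 let through, so the choice of test and cutoff here propagates into stage 2.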
My comment is STILL that this is a bad approach. I'm guilty of it, too.
We are focusing on the trees instead of the ecosystem -- and if we had
info/knowledge of gene-connectedness we wouldn't be doing this.
Moreover, if what you are doing at stage 2-k is based on 'binning' of
then a low frequency of false positives at stage 1 will matter less, and so
will slightly sub-optimal single-gene power.
> Use of the appropriate test (fold/T/F/cyber-T/etc.) for subset
> selection is IMHO the most important choice.
Yes, I agree. It's just that the fixation on this topic to the exclusion of
what seem to be scientifically relevant other topics is both maddening and