[BioC] Pre-filtering and the gene universe of gene set tests in microarray analysis

Arno, Matthew matthew.arno at kcl.ac.uk
Wed Jun 15 10:29:50 CEST 2011

Does that mean that there may be valid and invalid reasons for excluding or filtering out genes prior to a statistical filtering (e.g. a t-test)? Some examples are low expression values throughout the dataset, and low variation across the dataset (i.e. likely to be 'housekeeping-type' genes). I would say it's fine to exclude these if you are fully aware of the risks and you're just looking for a few genes for an informal study. Whether this is acceptable for publication-standard data is a separate question, and I don't know the answer. What do the MIAME chaps say about this? Are there any guidelines there?
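
To make the two filters mentioned above concrete, here is a minimal sketch of the kind of pre-filter I have in mind. It is only an illustration: the function name nonspecific_filter, the thresholds, and the toy log2 intensities are all invented for this example and are not taken from any Bioconductor package. The point is that the filter looks only at each gene's overall mean and variance and never at the group labels, which is what makes it "non-specific".

```python
from statistics import mean, pvariance


def nonspecific_filter(expr, min_mean=0.0, min_var=0.0):
    """Keep genes whose overall mean and variance exceed the thresholds.

    expr maps a gene name to its expression values across ALL samples.
    No group labels are used, so the filter is blind to the later test.
    (Hypothetical helper, written for illustration only.)
    """
    kept = {}
    for gene, values in expr.items():
        if mean(values) > min_mean and pvariance(values) > min_var:
            kept[gene] = values
    return kept


# Toy log2 intensities, made up for this example.
expr = {
    "geneA": [8.1, 8.3, 10.9, 11.2],  # expressed and variable
    "geneB": [3.0, 3.1, 2.9, 3.0],    # low expression throughout
    "geneC": [9.5, 9.5, 9.6, 9.5],    # expressed but nearly constant
}

kept = nonspecific_filter(expr, min_mean=5.0, min_var=0.1)
print(sorted(kept))  # ['geneA']
```

Only the genes left in kept would then go forward to the t-test. The genes that were dropped are exactly the ones whose handling in a gene set's summary statistic is being questioned in the original message below.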


Matthew Arno, Ph.D.
Genomics Centre Manager
King's College London

>-----Original Message-----
>From: bioconductor-bounces at r-project.org [mailto:bioconductor-bounces at r-
>project.org] On Behalf Of François Lefebvre
>Sent: 14 June 2011 20:59
>To: bioconductor at r-project.org
>Subject: [BioC] Pre-filtering and the gene universe of gene set tests in
>microarray analysis
>This question regards how various gene set testing methods deal with
>pre-filtered (non-specifically) data sets.
>Don't genes in a gene set that did not pass a filter constitute
>important evidence "against" that gene set? Not taking them into account
>when calculating whatever gene set summary statistic seems wrong (e.g.
>as recommended in chapters 13 & 14 of "Bioconductor Case Studies", if I
>read correctly). To put it differently, excluding a gene in the summary
>statistic calculation "because it is likely not to be interesting" seems
>different than excluding it because it was not on the chip in the first
>François Lefebvre
