[BioC] advice on absent present filtering needed

Jenny Drnevich drnevich at uiuc.edu
Thu Oct 26 16:42:18 CEST 2006


I concur: you do not want to throw out genes that are expressed in only one 
phenotype. If you plot a histogram of the number of present calls for each 
gene, you will see that the vast majority of genes are either present in 
all samples or absent in all samples. Your choice of filter affects only the 
small number of genes in between. To be conservative, I keep a gene if it is 
called present in at least 1 sample, so I don't even consider phenotype. The 
choice of filter really will only change the outcome for a few hundred genes, 
which won't matter much in terms of FDR correction, so I say be conservative 
and avoid throwing out a gene that is expressed in only one phenotype. To 
check the histogram:

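library(affy)  # provides mas5calls(); abatch is an AffyBatch, e.g. from ReadAffy()
calls.eset <- mas5calls(abatch)

# Number of samples in which each probeset is called present ("P")
hist(rowSums(exprs(calls.eset) == "P"))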
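
To see how the two filters under discussion differ, here is a minimal sketch on a toy matrix of calls. The matrix, the sample names and the grouping in pheno are all invented for illustration, and I am reading the "ratio >0.5" rule as "more than half the samples of a phenotype are called present", which is an assumption about what was meant.

```r
# Toy matrix of MAS5-style detection calls: rows are probesets, columns are samples.
# All values are invented for illustration.
calls <- matrix(
  c("P", "P", "P", "P", "P", "P",   # present everywhere
    "A", "A", "A", "A", "A", "A",   # absent everywhere
    "P", "P", "A", "A", "A", "A",   # present only in phenotype g1
    "A", "A", "A", "P", "A", "A"),  # present in a single sample
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("probe", 1:4), paste0("s", 1:6))
)
pheno <- factor(c("g1", "g1", "g2", "g2", "g3", "g3"))  # hypothetical grouping

is.present <- calls == "P"

# Filter 1: keep a probeset if it is called present in at least one sample
keep.any <- rowSums(is.present) >= 1

# Filter 2: keep a probeset if more than half the samples of any one
# phenotype are called present (assumed reading of the within-phenotype rule)
frac.by.pheno <- sapply(levels(pheno), function(g) {
  rowMeans(is.present[, pheno == g, drop = FALSE])
})
keep.within <- apply(frac.by.pheno, 1, max) > 0.5

# Probesets on which the two filters disagree
rownames(calls)[keep.any != keep.within]
```

With real data you would replace the toy matrix with exprs(calls.eset) and pheno with your own sample grouping. The probesets returned by the last line are the "in between" ones that the choice of filter actually affects.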

Cheers,
Jenny

At 08:27 AM 10/26/2006, you wrote:
>Your colleague is right.  Surely it is important to know if some
>genes express only in certain phenotypes.  Your method loses this information.
>
>--Naomi
>
>
>At 10:53 PM 10/25/2006, Kimpel, Mark William wrote:
> >I have a question about how to properly apply the MAS5 absent
> >present filtering technique. Within my group, I am advocating
> >setting a cutoff ratio of absent present across phenotypes (i.e. all
> >samples), whereas a colleague is advocating applying the filter
> >within phenotype and passing through the filter any probeset with
> >the A/P ratio of >0.5 within any of the phenotypes (we have 3).
> >
> >The argument my colleague makes is that some probesets may only be
> >expressed by one phenotype and we want to keep these in, but be
> >stringent within phenotype. This makes some biologic sense, but I am
> >concerned that this filtering within phenotype will introduce bias
> >at low expression levels, as it would seem to, at least in some
> >cases, act like a fold filter at expression levels near the limit of
> >reliable detection.
> >
> >Advice?
> >
> >Mark
> >
> >Mark W. Kimpel MD
> >
> >
> >Official Business Address:
> >
> >Department of Psychiatry
> >Indiana University School of Medicine
> >PR M116
> >Institute of Psychiatric Research
> >791 Union Drive
> >Indianapolis, IN 46202
> >
> >Preferred Mailing Address:
> >
> >15032 Hunter Court
> >Westfield, IN  46074
> >
> >(317) 490-5129 Work, & Mobile
> >
> >(317) 663-0513 Home (no voice mail please)
> >1-(317)-536-2730 FAX
> >
> >_______________________________________________
> >Bioconductor mailing list
> >Bioconductor at stat.math.ethz.ch
> >https://stat.ethz.ch/mailman/listinfo/bioconductor
> >Search the archives:
> >http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>Naomi S. Altman                                814-865-3791 (voice)
>Associate Professor
>Dept. of Statistics                              814-863-7114 (fax)
>Penn State University                         814-865-1348 (Statistics)
>University Park, PA 16802-2111
>

Jenny Drnevich, Ph.D.

Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu


