[BioC] unbiased filtering of paired dataset

Wolfgang Huber whuber at embl.de
Fri Nov 2 18:11:13 CET 2012


Dear Guido

As long as you use a filter criterion that is independent of the subsequent test statistic *under the null*, you're fine with respect to type I error. As far as I can think right now, the variance of the within-subject effects, for example, fulfils this condition as a filter criterion if you follow it with something like a paired t-test.
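
A minimal sketch of that order of operations, written in Python with only the standard library purely for illustration (on this list one would normally do this in R). All function names and the toy data are hypothetical. It computes after minus before for each gene, ranks genes by the variance of those differences, keeps the most variable fraction, and only then computes the paired t-statistic for the retained genes.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_differences(before, after):
    """Within-subject effects (after - before) for one gene."""
    return [a - b for b, a in zip(before, after)]


def paired_t(diffs):
    """Paired t-statistic: mean difference divided by its standard error."""
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def filter_then_test(genes, keep_fraction=0.5):
    """genes maps a gene name to (before, after) lists, one value per subject.

    Returns the paired t-statistic for the retained genes only.
    """
    diffs = {g: paired_differences(b, a) for g, (b, a) in genes.items()}
    # Filter criterion: variance of the within-subject effects.
    criterion = {g: variance(d) for g, d in diffs.items()}
    n_keep = max(1, round(keep_fraction * len(genes)))
    kept = sorted(criterion, key=criterion.get, reverse=True)[:n_keep]
    return {g: paired_t(diffs[g]) for g in kept}


# Hypothetical toy data: 4 genes, 4 subjects each.
toy_genes = {
    "geneA": ([5.0, 7.0, 9.0, 6.0], [5.1, 7.2, 9.1, 6.2]),
    "geneB": ([6.0, 6.1, 5.9, 6.0], [7.0, 8.1, 6.4, 7.5]),
    "geneC": ([4.0, 8.0, 6.0, 10.0], [4.0, 8.1, 5.9, 10.0]),
    "geneD": ([5.0, 5.0, 5.0, 5.0], [6.0, 4.0, 5.5, 4.5]),
}
t_stats = filter_then_test(toy_genes, keep_fraction=0.5)
print(t_stats)
```

Turning the t-statistics into p-values is left out here because the Python standard library has no t-distribution CDF; the point is only that the filter is applied to the differences before the test is run.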

To choose which filter criterion works best (gives the best power boost), use a diagnostic plot like Fig. 1D in our paper [1].
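
As a rough sketch of the kind of comparison such a plot supports, the snippet below counts Benjamini-Hochberg rejections as a function of the fraction of genes filtered out. I am assuming here that "number of rejections versus filtered fraction" is the quantity of interest; please check the paper for the exact form of the figure. The function names, the FDR level of 0.1, and the toy p-values are all hypothetical.

```python
def bh_reject_count(pvalues, alpha=0.1):
    """Number of Benjamini-Hochberg rejections at FDR level alpha."""
    m = len(pvalues)
    count = 0
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= alpha * rank / m:
            count = rank  # largest rank that passes the step-up threshold
    return count


def rejections_by_theta(filter_stat, pvalues, thetas, alpha=0.1):
    """For each theta, drop the fraction theta of genes with the lowest
    filter statistic, then apply BH to the p-values of the remaining genes."""
    order = sorted(range(len(pvalues)), key=lambda i: filter_stat[i])
    result = []
    for theta in thetas:
        n_drop = int(theta * len(order))
        kept = [pvalues[i] for i in order[n_drop:]]
        result.append((theta, bh_reject_count(kept, alpha)))
    return result


# Hypothetical toy input: one filter statistic and one p-value per gene.
toy_filter_stat = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
toy_pvalues = [0.9, 0.7, 0.8, 0.6, 0.5, 0.3, 0.045, 0.015, 0.025, 0.001]
curve = rejections_by_theta(toy_filter_stat, toy_pvalues, thetas=[0.0, 0.5])
print(curve)
```

Running this once per candidate filter criterion and comparing the resulting curves gives a simple way to see which criterion yields more rejections at a given filtered fraction.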

Best wishes
	Wolfgang

[1] http://www.pnas.org/content/107/21/9546.long

On Oct 30, 2012, at 4:42 PM, "Hooiveld, Guido" <Guido.Hooiveld at wur.nl> wrote:

> Dear listers,
> I would like to reduce my array dataset by IQR filtering. However, I have a paired design (I have samples from the same subject before and after a treatment).
> I was wondering whether IQR filtering on the normalized data as such would be recommended for such a paired design, or whether it would be better to first calculate the treatment effect (after - before) for each gene in each individual and then apply IQR filtering.
> I am asking because in our intervention studies the between-subject effect is normally larger than the within-subject (treatment) effect. As a result, I am afraid that I introduce a 'bias' by retaining genes that vary highly between individuals, while discarding genes that respond to the treatment (the relevant ones).
> 
> I checked this on a sample dataset: if I retain the 50% most variable genes by IQR filtering, I find an overlap of only ~85% between the two approaches (7133 of the 8426 genes retained by each approach are retained by both; approach 1 is IQR filtering directly on the normalized data; approach 2 is subtracting BEFORE from AFTER, followed by IQR filtering).
> 
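
The comparison described in the quoted paragraph can be sketched as follows, in Python with only the standard library. The helper names and the toy data are hypothetical; the toy genes are built so that two of them vary mostly between subjects and two mostly within subjects, which makes the two approaches disagree completely. The last line only redoes the arithmetic for the reported figures.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range (third quartile minus first quartile)."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


def top_fraction(scores, keep_fraction=0.5):
    """Names of the genes with the highest scores."""
    n_keep = max(1, round(keep_fraction * len(scores)))
    return set(sorted(scores, key=scores.get, reverse=True)[:n_keep])


# Hypothetical toy data: gene -> (before, after), one value per subject.
toy_genes = {
    "geneA": ([5.0, 7.0, 9.0, 6.0], [5.1, 7.2, 9.1, 6.2]),
    "geneB": ([6.0, 6.1, 5.9, 6.0], [7.0, 8.1, 6.4, 7.5]),
    "geneC": ([4.0, 8.0, 6.0, 10.0], [4.0, 8.1, 5.9, 10.0]),
    "geneD": ([5.0, 5.0, 5.0, 5.0], [6.0, 4.0, 5.5, 4.5]),
}

# Approach 1: IQR over all normalized values of a gene.
kept_direct = top_fraction({g: iqr(b + a) for g, (b, a) in toy_genes.items()})

# Approach 2: IQR over the within-subject differences (after - before).
kept_paired = top_fraction(
    {g: iqr([x - y for y, x in zip(b, a)]) for g, (b, a) in toy_genes.items()}
)

overlap = kept_direct & kept_paired
print(sorted(kept_direct), sorted(kept_paired), sorted(overlap))

# Overlap fraction for the figures reported in the quoted message.
reported_overlap = 7133 / 8426
print(round(reported_overlap, 3))
```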
> So any suggestion on how to optimally filter a paired dataset would be appreciated.
> 
> Regards,
> Guido
> 
> ---------------------------------------------------------
> Guido Hooiveld, PhD
> Nutrition, Metabolism & Genomics Group
> Division of Human Nutrition
> Wageningen University
> Biotechnion, Bomenweg 2
> NL-6703 HD Wageningen
> the Netherlands
> tel: (+)31 317 485788
> fax: (+)31 317 483342
> email:      guido.hooiveld at wur.nl
> internet:   http://nutrigene.4t.com
> http://scholar.google.com/citations?user=qFHaMnoAAAAJ
> http://www.researcherid.com/rid/F-4912-2010
> 


