[BioC] Filtering before differential expression analysis of microarrays - New paper out (James W. MacDonald)

Sherosha Raj sherosha at gmail.com
Tue Jan 13 19:09:29 CET 2009


Hello Jenny

This is how I set up the filters:

#setup filters
> f1=pOverA(0.25,log2(100))
> f2=function(x)(IQR(x)>0.5)
> ff=filterfun(f1,f2)
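
For readers unfamiliar with genefilter, the following is a minimal base-R sketch of what the two filters above test on each probeset. It assumes log2-scale expression values; the matrix `expr` is simulated purely for illustration, and the base-R `f1` is a stand-in written to mimic `pOverA(0.25, log2(100))`, not the genefilter implementation itself.

```r
# Simulated log2-scale data: 200 probesets on 8 chips (illustration only)
set.seed(1)
expr <- matrix(rnorm(200 * 8, mean = 7, sd = 1.5), nrow = 200,
               dimnames = list(paste0("probe", 1:200), paste0("chip", 1:8)))

# Stand-in for pOverA(0.25, log2(100)):
# at least 25% of the chips must exceed log2(100)
f1 <- function(x) mean(x > log2(100)) >= 0.25

# Same test as f2 above: inter-quartile range greater than 0.5
f2 <- function(x) IQR(x) > 0.5

# A probeset is kept only if it passes both tests, as with filterfun(f1, f2)
selected <- apply(expr, 1, function(x) f1(x) && f2(x))

sum(selected)                               # number of probesets kept
expr.sub <- expr[selected, , drop = FALSE]  # filtered expression matrix
```

The logical vector `selected` has one entry per probeset, so it can be used as a row index on any object that has the same rows in the same order.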

Around here I sub-select from my expression set the probesets that come
through the filter, then proceed to limma:

#LIMMA

> targets=readTargets("targets.txt",sep="")
> WD=paste(targets$.....)
> WD=factor(WD,levels=c("........."))
> design=model.matrix(~0+WD)
> colnames(design)=levels(WD)
#all.esetsub is from a large eset normalised over 102 chips, so subset the relevant CEL files
> fit=lmFit(all.esetsub[,61:102],design)

#Contrast matrix
> contmatrix=makeContrasts(.........,levels=design)

> fit2=contrasts.fit(fit,contmatrix)

If I filter here using the two filters above, nothing comes through:

> selected=genefilter(fit2,ff)
> sum(selected)
[1] 0
> class(fit2)
[1] "MArrayLM"
attr(,"package")
[1] "limma"

When I filter before starting limma, I get 11504 probesets coming through.
I am confused about how to proceed with the next steps (i.e. subsetting the
fit2 object and applying eBayes). :-(

Previously I proceeded as follows after the contrasts.fit step:
> fit2=eBayes(fit2)
> changinggenes.05=decideTests(fit2,adjust.method="BH",p.value=0.05)

etc etc


I have previously been filtering before limma, but I've been
following the discussions on this board and would like to see how the
data look if I filter prior to the eBayes step.
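
One way the filtering could instead be moved to after the eBayes step, following the advice quoted further down in this thread (fit and moderate on all probesets, then filter before the FDR adjustment), is sketched below. It is only an illustration under stated assumptions: the expression matrix, the group labels A and B, and the contrast BvsA are simulated stand-ins (not the targets file or contrasts used above), and the filter is computed in base R from the expression values rather than from the fit object.

```r
library(limma)

# Simulated stand-in data: 500 probesets, 3 chips per group (hypothetical)
set.seed(1)
expr <- matrix(rnorm(500 * 6, mean = 7, sd = 1.5), nrow = 500,
               dimnames = list(paste0("probe", 1:500), paste0("chip", 1:6)))
WD <- factor(rep(c("A", "B"), each = 3), levels = c("A", "B"))

design <- model.matrix(~0 + WD)
colnames(design) <- levels(WD)
contmatrix <- makeContrasts(BvsA = B - A, levels = design)

# Fit and moderate using all probesets
fit  <- lmFit(expr, design)
fit2 <- eBayes(contrasts.fit(fit, contmatrix))

# Compute the filter from the expression values (same criteria as f1 and f2)
selected <- apply(expr, 1, function(x)
  mean(x > log2(100)) >= 0.25 && IQR(x) > 0.5)

# Subset the MArrayLM object by row, as one would an ExpressionSet
fit2.sub <- fit2[selected, ]

# The multiple-testing adjustment now uses only the retained probesets
changinggenes.05 <- decideTests(fit2.sub, adjust.method = "BH", p.value = 0.05)
```

Whether filtering at this point leaves the FDR estimates accurate is the open question raised in the quoted discussion, so this should be read as one option rather than a recommendation.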


Any help is greatly appreciated!!
Thank you very much!
Regards,
Sherosha

2009/1/13 Jenny Drnevich <drnevich at illinois.edu>:
> Hi Sherosha,
>
> In general, you can filter by subsetting a MArrayLM object the exact same
> way as you would an ExpressionSet object. If you have any trouble, please
> post the code that you are trying to use.
>
> Cheers,
> Jenny
>
> At 10:47 AM 1/13/2009, Sherosha Raj wrote:
>>
>> Hello all
>>
>> I'm sorry if this is a simple question, but how does one go about
>> filtering after the eBayes step since the resulting object is of the
>> class MArrayLM?
>> I am used to filtering expression sets directly.
>>
>> Thank you very much!
>> Sherosha
>> >
>> > ---------- Forwarded message ----------
>> > From: "James W. MacDonald" <jmacdon at med.umich.edu>
>> > To: Daniel Brewer <daniel.brewer at icr.ac.uk>
>> > Date: Mon, 12 Jan 2009 09:25:02 -0500
>> > Subject: Re: [BioC] Filtering before differential expression analysis of
>> > microarrays - New paper out
>> > Hi Dan,
>> >
>> > Daniel Brewer wrote:
>> >>
>> >> Hi,
>> >>
>> >> There is a new paper out at BMC Bioinformatics that seems to justify
>> >> the use of filtering before differential expression analysis is
>> >> performed (Hackstadt & Hess BMC Bioinformatics 2009, 10:11 -
>> >> http://www.biomedcentral.com/1471-2105/10/11/abstract).  Specifically
>> >> filtering by variance and detection call.  I have got the impression
>> >> from this list that the general opinion is that one should only filter
>> >> out the control genes before testing.  I was wondering if anyone had
>> >> any opinions on this paper and the topic in general.
>> >
>> > I'm sure people do have opinions about this topic ;-D
>> >
>> > The reason people have so many opinions is because it isn't a simple
>> > question, and it depends on what you consider important.
>> >
>> > If you are just trying to limit the number of multiple comparisons to
>> > increase power, then filtering first is probably the way to go.
>> >
>> > If you are concerned with the accuracy of the FDR estimates, then
>> > filtering first may not be ideal.
>> >
>> > If you are using limma (Hackstadt and Hess used multtest), then you
>> > should filter after the eBayes step but before the FDR step, as an
>> > assumption of the eBayes step is that all of the data from the chip are
>> > available.
>> >
>> > Unless of course you are concerned about the accuracy of the FDR
>> > estimates, in which case... well you see the point.
>> >
>> > With microarray data analysis the arguments for and against a particular
>> > way of doing things can shed more heat than light, as nobody really knows
>> > the underlying truth, and the measures we use are really far removed from
>> > the actual phenomenon we are testing.
>> >
>> > Best,
>> >
>> > Jim
>> >
>> >
>> >>
>> >> Many thanks
>> >>
>> >> Dan
>> >>
>> >
>> > --
>> > James W. MacDonald, M.S.
>> > Biostatistician
>> > Hildebrandt Lab
>> > 8220D MSRB III
>> > 1150 W. Medical Center Drive
>> > Ann Arbor MI 48109-5646
>> > 734-936-8662
>> >
>> >
>> >
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> Jenny Drnevich, Ph.D.
>
> Functional Genomics Bioinformatics Specialist
> W.M. Keck Center for Comparative and Functional Genomics
> Roy J. Carver Biotechnology Center
> University of Illinois, Urbana-Champaign
>
> 330 ERML
> 1201 W. Gregory Dr.
> Urbana, IL 61801
> USA
>
> ph: 217-244-7355
> fax: 217-265-5066
> e-mail: drnevich at illinois.edu
>


