[BioC] topTable

Jenny Drnevich drnevich at uiuc.edu
Tue Aug 28 16:40:24 CEST 2007


Hi Lev,

I think you are a little fixated on removing probes that are "bad" in 
one of your two contrasts. I don't think it's that serious of an 
issue, and I don't know anyone else who worries about it either. 
Especially since as you mention, there are not that many "bad" 
probes. It's highly unlikely that they would be significant anyway, 
so I don't see why you are so set on removing them.  At most, I would 
only worry about checking significant genes in each contrast. Even if 
they slipped through, you are expecting some false positives in your 
list anyway, so I don't think they would radically affect the 
conclusions drawn from the lists.  Your analysis steps 1-4 are 
fine, and I would stop there.
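
If you do want to check the significant genes in a contrast, here is a minimal sketch of one way to do it, not a definitive recipe. The data are simulated, and `min_expr` with its "fitted group mean above a cutoff" rule is a hypothetical stand-in for whatever detection filter you used in step 2. It also assumes a limma version in which topTable returns probe IDs as row names.

```r
library(limma)

set.seed(1)
# Simulated log2 expression, 200 probes x 9 arrays (illustrative only)
expr <- matrix(rnorm(200 * 9, mean = 8, sd = 0.3), nrow = 200,
               dimnames = list(paste0("probe", 1:200), NULL))
expr[1:10, 4:6] <- expr[1:10, 4:6] + 3   # probes shifted in group 2

# Same design and contrasts as in the original post
design <- model.matrix(~0 + factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3)))
colnames(design) <- c("group1", "group2", "group3")
contrast.matrix <- makeContrasts(group2 - group1, group3 - group1,
                                 levels = design)

fit  <- lmFit(expr, design)
fit2 <- eBayes(contrasts.fit(fit, contrast.matrix))

# Keep only the probes called significant for contrast 2vs1
sig <- topTable(fit2, coef = 1, number = Inf,
                adjust.method = "BH", p.value = 0.05)

# Hypothetical rule: a group is "expressed" if its fitted mean exceeds min_expr.
# With a ~0 design, fit$coefficients holds the per-group means.
min_expr <- 6
means <- fit$coefficients[rownames(sig), c("group1", "group2"), drop = FALSE]

# Flag significant probes that are below the cutoff in both groups
sig$noise_vs_noise <- rowSums(means > min_expr) == 0
```

Only the flagged rows of `sig` would need a second look; the fit, the moderated statistics and the adjusted p-values are left exactly as steps 1-4 produced them.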

That's my 2 cents,
Jenny

>   I do some analysis in LIMMA and would be very grateful for your comments.
>   I have three treatments: 1, 2 and 3, comparing 2vs.1 and 3vs.1. 
> Then I analyse the created lists further, identifying genes that 
> are different/similar between the contrasts. As suggested earlier 
> on this list, I:
>   1. normalise using ALL the data;
>   2. filter out probes which are not expressed across ALL 
> treatments 1, 2 and 3;
>   3. run LIMMA on the filtered data;
>   4. produce two gene lists for the two contrasts 2vs1 and 3vs1, 
> using topTable.
>
>   To take full advantage of LIMMA, in steps 3 and 4 above, 
> I process the data for all treatments together:
>   design <- model.matrix(~0 +factor(c(1,1,1,2,2,2,3,3,3)))
>   colnames(design) <- c("group1", "group2", "group3")
>   contrast.matrix <- makeContrasts(group2-group1, group3-group1, levels=design)
>   fit <- lmFit(data_normalised_filtered, design)
>   fit2 <- contrasts.fit(fit, contrast.matrix)
>   fit2 <- eBayes(fit2)
>   topTable(fit2, coef=1, adjust="BH")
>   topTable(fit2, coef=2, adjust="BH")
>
>   This means that some probes may have meaningless results for one 
> of the two contrasts. For example, if probe A is "not expressed" in 
> 1 and 2, but is "expressed" in 3, it will be kept in the analysis 
> (step 2), but obviously its fold change or p-values will be 
> meaningless for the 2vs.1 comparison (because we are comparing 
> noise vs. noise here). Recognising this, as the 5th step of my 
> procedure (after running topTable), I remove probes such as A from 
> the topTable results for the comparison 2vs.1, but keep them in the 
> results for the comparison 3vs.1.
>   So, for example, the topTable for the contrast 2vs.1:
>   ID  logFC       t       P.Value  adj.P.Val      B
>   X   -3.58  -14.19  1.068322e-06     0.0164  3.839
>   Y   -4.71  -13.02  2.000032e-06     0.0164  3.589
>   A   -2.52  -11.94  3.721566e-06     0.0203  3.315
>   Z   -2.19  -11.17  5.993895e-06     0.0222  3.086
>   Will become:
>   ID  logFC       t       P.Value  adj.P.Val      B
>   X   -3.58  -14.19  1.068322e-06     0.0164  3.839
>   Y   -4.71  -13.02  2.000032e-06     0.0164  3.589
>   Z   -2.19  -11.17  5.993895e-06     0.0222  3.086
>
>   The other way to make comparisons 2vs.1 and 3vs.1 would be to 
> process them separately, doing filtering for each pair separately 
> as well. But then it would decrease the power.
>   I realise that keeping such partially "bad" probes (probes that 
> are "bad" in one comparison, but are "good" in the other) and 
> removing them after running the topTable can adversely affect 
> "good" probes. It can happen either through eBayes or through the 
> multiple testing correction. My perception is that it would not 
> affect the results a lot, because the "bad" probes are not 
> numerous. Besides, probe rankings should remain the same.
>   Would you say that what I described above is a sensible way to go?
>
>   Looking forward to your replies,
>   Lev.

Jenny Drnevich, Ph.D.

Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu