[BioC] gcrma vs rma

Zhijin Wu zwu at stat.brown.edu
Thu Sep 2 20:31:20 CEST 2010


On 9/2/2010 1:49 PM, Lucia wrote:
>
> Hi,
> Is it possible that using gcrma has a detrimental effect on my results?
>
> I am using Mouse 430 2 arrays with 8 wt samples and 8 mutants. My mutants
> have a dominant negative mutation that I use as a positive control, since
> the wt levels for the probe where the mutation is should always be higher.
> The thing is that I only detect my control if I normalize using rma, not
> if I use gcrma.
> It is my understanding that if you don't have an NSB experiment, gcrma
> will use the MM probes to calculate nonspecific binding affinities. Is
> that right?

gcrma uses the MM probes to train the model, but it does not use the MM
sequence to compute the background for the PM probe. The nonspecific
(background) affinity of each PM probe is computed from its own PM sequence.

Array-wide, the affinities of PM probes and MM probes are correlated,
since each pair differs in only one base after all. But they are not
identical.
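
To make the point concrete, here is a minimal Python sketch (not gcrma's
code). It assumes an additive model in which a probe's affinity is the sum
of position-dependent base effects, and it assumes the MM probe is the PM
probe with its 13th base replaced by the complement. The weights in
toy_weight and the example sequence are made up for illustration; gcrma
estimates its own coefficients from the data.

```python
import math

PROBE_LENGTH = 25
MISMATCH_INDEX = 12  # 13th base of the 25-mer, 0-based
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical per-base amplitudes, chosen only for illustration.
TOY_AMPLITUDE = {"A": -0.3, "C": 0.4, "G": 0.1, "T": -0.2}


def toy_weight(base, position):
    """Made-up smooth position-dependent base effect (not fitted values)."""
    shape = math.sin(math.pi * (position + 1) / (PROBE_LENGTH + 1))
    return TOY_AMPLITUDE[base] * shape


def affinity(sequence):
    """Additive affinity: sum of the base effect at each probe position."""
    if len(sequence) != PROBE_LENGTH:
        raise ValueError("expected a 25-mer probe sequence")
    return sum(toy_weight(base, k) for k, base in enumerate(sequence))


def mismatch_sequence(pm_sequence):
    """Build the MM sequence by complementing the middle base of the PM."""
    middle = COMPLEMENT[pm_sequence[MISMATCH_INDEX]]
    return (pm_sequence[:MISMATCH_INDEX] + middle
            + pm_sequence[MISMATCH_INDEX + 1:])


pm = "ACGTACGTACGTACGTACGTACGTA"  # arbitrary example sequence
mm = mismatch_sequence(pm)

pm_affinity = affinity(pm)  # computed from the PM sequence only
mm_affinity = affinity(mm)  # computed from the MM sequence only

print("PM affinity:", pm_affinity)
print("MM affinity:", mm_affinity)
print("difference :", mm_affinity - pm_affinity)
```

In this toy model the PM and MM affinities share 24 of their 25 terms,
which is why they track each other across the array, while the middle-base
term keeps them from being identical.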

> I am worried that using the MM probes can cause an artifact in the
> normalization, given that a lot of MM probes have higher affinity than
> the PM.
> I appreciate any insight.
> Thanks!
> Lucia
>
> On Aug 17, 2010, at 5:31 AM, Tobias Straub <tstraub at med.uni-muenchen.de>
> wrote:
>
>> Hi Wolfgang,
>>
>> just an experience. in some of my analyses, applying variance filtering
>> resulted in problems fitting N(0,1) to the limma t statistic. now that
>> i have had a quick look at your paper, i get the idea that combining
>> limma with the variance filter is not a good idea anyway.
>>
>> the performance of mas call-based filtering/limma t as compared to
>> variance filter/standard t is however (slightly) better as estimated
>> by ROC curve analysis on my prior-knowledge data (3 arrays/condition).
>> this is probably not unexpected?
>>
>> anyway thanks for pointing to the paper, apparently a must-read before
>> applying the nsFilter function.
>>
>> best regards
>> Tobias
>>
>>
>> On Aug 17, 2010, at 9:36 AM, Wolfgang Huber wrote:
>>
>>> Hi Tobias,
>>> you said you were worried about "filtering based on variance or IQR -
>>> as it jeopardizes ... applying a threshold on the local false
>>> discovery rate." I am not sure I understand what you mean, but the
>>> effect (or, if properly applied, non-effect) of filtering on type-I
>>> error is also discussed in [1] in some detail.
>>>
>>>
>>>
>>> [1] Richard Bourgon et al. Independent filtering increases detection
>>> power for high-throughput experiments. PNAS, 107(21):9546-9551, 2010.
>>> [2] Talloen et al. I/NI-calls for the exclusion of non-informative
>>> genes: a highly effective filtering tool for microarray data.
>>> Bioinformatics, doi:10.1093/bioinformatics/btm478
>>>
>>> Best wishes
>>> Wolfgang
>>>
>>>
>>> On 16/08/10 16:50, Lucia Peixoto wrote:
>>>> Thanks Tobias for your response
>>>>
>>>> I am processing data obtained with Affymetrix mouse chips (430_2,
>>>> previous version).
>>>> The first filtering was done based on presence/absence calls, so only
>>>> genes present in 2/17 samples were used. It is a 2-condition setup, with
>>>> 8 and 9 replicates for each condition. My definition of FDR in my
>>>> previous question was strictly limited to validation in 8+ independent
>>>> qPCRs of 40+ randomly selected genes obtained using a SAM cutoff of 5%
>>>> FDR. So I am talking about independently re-testing the reproducibility
>>>> of gene expression, which is the only way to really know your FDR. Using
>>>> the Mas5 presence/absence calls filter leads to about 50% of the tested
>>>> genes not being reproducible.
>>>>
>>>> If I remove the filtering and redo the analysis at 5% FDR, all the
>>>> previous "false positives" become true positives. That was not a
>>>> surprise to me, since about 1/3 of MM probes are known to hybridize
>>>> better than PM probes. So I don't know what Mas5 presence/absence really
>>>> means, but it definitely cannot accurately reflect the presence of a
>>>> transcript if the MM probe hybridizes better.
>>>>
>>>> The problem is that I have a great loss of sensitivity (I have a lot of
>>>> positive controls, so I know that), and I would like to increase it
>>>> using a filter that can come closer to really defining "present",
>>>> because MM/PM does not.
>>>> any ideas?
>>>> thanks
>>>>
>>>> Lucia
>>>>
>>>>
>>>> On Mon, Aug 16, 2010 at 8:34 AM, Tobias Straub
>>>> <tstraub at med.uni-muenchen.de> wrote:
>>>>
>>>>> Hi Lucia
>>>>>
>>>>> I am not sure if I completely understand your problem, just want to
>>>>> mention that I routinely apply non-specific filtering based on MAS5
>>>>> calls with a very good outcome (based on a prior-knowledge training
>>>>> set). I do not like so much the alternative approach - filtering based
>>>>> on variance or IQR - as it jeopardizes my preferred way of defining
>>>>> responders by applying a threshold on the local false discovery rate.
>>>>>
>>>>> Could you expand a bit on how exactly you filter based on MAS5 calls,
>>>>> how you define responders and non-responders in qPCR, and what your
>>>>> "FDR disaster" looks like exactly?
>>>>>
>>>>> What is your model system, by the way, and which arrays do you use?
>>>>>
>>>>> best regards
>>>>> T.
>>>>>
>>>>>
>>>>> On Aug 13, 2010, at 7:11 PM, Lucia Peixoto wrote:
>>>>>
>>>>>> Dear All,
>>>>>> I want to set up a non-specific filter to eliminate genes that are
>>>>>> just not expressed from further statistical analysis. I've previously
>>>>>> tried a filter based on Mas5 presence/absence calls, which turned out
>>>>>> to be a disaster for the FDR (as measured by lots of qPCRs), probably
>>>>>> because 1/3 of the MM probes actually hybridize better than PM, who
>>>>>> knows.
>>>>>>
>>>>>> In any case, my plan is to set up a filter based on both raw
>>>>>> fluorescence intensity and IQR. I am trying to get as much sensitivity
>>>>>> as possible without increasing my FDR too much.
>>>>>> I was thinking that using the intensity distributions and box plots of
>>>>>> the raw data may be useful to determine the best cutoffs for the best
>>>>>> sensitivity.
>>>>>> Any advice on how to select appropriate cutoffs?
>>>>>>
>>>>>> Thank you very much in advance
>>>>>> Lucia
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioconductor mailing list
>>>>>> Bioconductor at stat.math.ethz.ch
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>>> ----------------------------------------------------------------------
>>>>> Dr. Tobias Straub ++4989218075439 Adolf-Butenandt-Institute, München D
>>>>>
>>>>>
>>>>
>>>
>>> --
>>>
>>>
>>> Wolfgang Huber
>>> EMBL
>>> http://www.embl.de/research/units/genome_biology/huber
>>>
>>
>


-- 
------------------------------------
Zhijin (Jean) Wu
Assistant Professor of Biostatistics
Brown University, Box G-S121
Providence, RI  02912

Tel: 401 863 1230
Fax: 401 863 9182
http://www.stat.brown.edu/zwu


