[BioC] gcrma vs rma

Thu Sep 2 19:49:37 CEST 2010

Hi,
Is it possible that using gcrma has a detrimental effect on my results?

I am using Mouse 430 2 arrays, with 8 wt samples and 8 mutant. My mutants
have a dominant negative mutation that I use as a positive control, since
the wt levels for the probe where the mutation is should always be higher.
The thing is that I only detect my control if I normalize using rma,
not if I use gcrma.
It is my understanding that if you don't have an NSB experiment, gcrma
will use the MM probes to calculate nonspecific binding affinities. Is
that right?
I am worried that using the MM probes can cause an artifact in the
normalization, given that a lot of MM probes have higher affinity than
the PM.
I appreciate any insight.
Thanks!
Lucia
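
One quick way to check the worry above about MM probes is to count how
often MM exceeds PM in the raw probe-level data. The following is only a
minimal Python sketch, not what gcrma does internally. The function name
and the toy intensities are made up for illustration; real values would
have to be read from the CEL files.

```python
def fraction_mm_above_pm(pm, mm):
    """Return the fraction of probe pairs whose MM intensity exceeds PM."""
    if len(pm) != len(mm):
        raise ValueError("pm and mm must have the same length")
    if not pm:
        raise ValueError("need at least one probe pair")
    above = sum(1 for p, m in zip(pm, mm) if m > p)
    return above / len(pm)


# Toy raw intensities for six probe pairs (hypothetical values).
pm = [120.0, 300.0, 85.0, 410.0, 95.0, 60.0]
mm = [100.0, 150.0, 90.0, 200.0, 130.0, 40.0]

frac = fraction_mm_above_pm(pm, mm)
print(f"{frac:.2f} of probe pairs have MM > PM")
```

If that fraction is large for the probe set carrying the control, it
would at least be consistent with the MM-based background adjustment
playing a role, although it does not prove it.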

On Aug 17, 2010, at 5:31 AM, Tobias Straub
<tstraub at med.uni-muenchen.de> wrote:

> Hi Wolfgang,
>
> just an experience. In some of my analyses, applying variance
> filtering resulted in problems fitting N(0,1) to the limma t
> statistic. Now that I have had a quick look at your paper, I get the
> idea that combining limma with the variance filter is not a good
> idea anyway.
>
> The performance of MAS5 call-based filtering/limma t as compared to
> variance filter/standard t is, however, (slightly) better as estimated
> by ROC curve analysis on my prior-knowledge data (3 arrays/condition).
> This is probably not unexpected?
>
> Anyway, thanks for pointing to the paper, apparently a must-read
> before applying the nsFilter function.
>
> best regards
> Tobias
>
>
> On Aug 17, 2010, at 9:36 AM, Wolfgang Huber wrote:
>
>> Hi Tobias,
>> you said you were worried about "filtering based on variance or IQR  
>> - as it jeopardizes ... applying a threshold on the local false  
>> discovery rate." I am not sure I understand what you mean, but the  
>> effect (or, if properly applied, non-effect) of filtering on type-I  
>> error is also discussed in [1] in some detail.
>>
>>
>>
>> [1] Richard Bourgon et al. Independent filtering increases
>> detection power for high-throughput experiments. PNAS,
>> 107(21):9546-9551, 2010.
>> [2] Talloen et al. I/NI-calls for the exclusion of non-informative
>> genes: a highly effective filtering tool for microarray data.
>> Bioinformatics, doi:10.1093/bioinformatics/btm478
>>
>>    Best wishes
>>    Wolfgang
>>
>>
>> On 16/08/10 16:50, Lucia Peixoto wrote:
>>> Thanks Tobias for your response
>>>
>>> I am processing data obtained with Affymetrix mouse chips (430_2,
>>> previous version).
>>> The first filtering was done based on presence/absence calls, so only
>>> genes present in 2/17 samples were used. It is a 2-condition setup,
>>> with 8 and 9 replicates for each condition. My definition of FDR in my
>>> previous question was strictly limited to validation in 8+ independent
>>> qPCRs of 40+ randomly selected genes obtained using a SAM cutoff of 5%
>>> FDR. So I am talking about independently re-testing the
>>> reproducibility of gene expression, which is the only way to really
>>> know your FDR. Using the MAS5 presence/absence calls filter leads to
>>> about 50% of the tested genes not being reproducible.
>>>
>>> If I remove the filtering and redo the analysis at 5% FDR, all the
>>> previous "false positives" become true positives. This was not a
>>> surprise to me, since about 1/3 of MM probes are known to hybridize
>>> better than PM probes, so I don't know what MAS5 presence/absence
>>> really means, but it definitely cannot accurately reflect the presence
>>> of a transcript if the MM probe hybridizes better.
>>>
>>> The problem is that I have a great loss of sensitivity (I have a lot of
>>> positive controls, so I know that), and I would like to increase it
>>> using a filter that can come closer to really defining "present",
>>> because MM/PM does not.
>>> Any ideas?
>>> Thanks
>>>
>>> Lucia
>>>
>>>
>>> On Mon, Aug 16, 2010 at 8:34 AM, Tobias Straub
>>> <tstraub at med.uni-muenchen.de> wrote:
>>>
>>>> Hi Lucia
>>>>
>>>> I am not sure I completely understand your problem; I just want to
>>>> mention that I routinely apply non-specific filtering based on MAS5
>>>> calls with a very good outcome (based on a prior-knowledge training
>>>> set). I do not like so much the alternative approach - filtering based
>>>> on variance or IQR - as it jeopardizes my preferred way of defining
>>>> responders by applying a threshold on the local false discovery rate.
>>>>
>>>> Could you expand a bit on how exactly you filter based on MAS5 calls,
>>>> how you define responders and non-responders in qPCR, and what your
>>>> "FDR disaster" looks like exactly?
>>>>
>>>> What is your model system, by the way, and which arrays do you use?
>>>>
>>>> best regards
>>>> T.
>>>>
>>>>
>>>> On Aug 13, 2010, at 7:11 PM, Lucia Peixoto wrote:
>>>>
>>>>> Dear All,
>>>>> I want to set up a non-specific filter to eliminate genes that are
>>>>> just not expressed from further statistical analysis. I've previously
>>>>> tried a filter based on MAS5 presence/absence calls, which turned out
>>>>> to be a disaster for the FDR (as measured by lots of qPCRs), probably
>>>>> because 1/3 of the MM probes actually hybridize better than PM, who
>>>>> knows.
>>>>>
>>>>> In any case, my plan is to set up a filter based both on raw
>>>>> fluorescence intensity and IQR. I am trying to get as much sensitivity
>>>>> as possible without increasing my FDR too much.
>>>>> I was thinking that using the intensity distributions and box plots of
>>>>> the raw data may be useful to determine the best cutoffs for
>>>>> sensitivity.
>>>>> Any advice on how to select appropriate cutoffs?
>>>>>
>>>>> Thank you very much in advance
>>>>> Lucia
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>> ----------------------------------------------------------------------
>>>> Dr. Tobias Straub ++4989218075439 Adolf-Butenandt-Institute, München D
>>>>
>>>>
>>
>> -- 
>>
>>
>> Wolfgang Huber
>> EMBL
>> http://www.embl.de/research/units/genome_biology/huber
>
> ----------------------------------------------------------------------
> Dr. Tobias Straub ++4989218075439 Adolf-Butenandt-Institute, München D
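
The intensity-plus-IQR filter proposed in the first message of this
thread could be sketched roughly as follows. This is a minimal Python
illustration under stated assumptions, not a recommended procedure: the
function names, the probe IDs, the toy log2 values and all three cutoffs
are hypothetical placeholders, and the effect of any such filter on
type-I error is the subject of reference [1] quoted above.

```python
def quantile(values, q):
    """Linear-interpolation quantile (the default 'type 7' rule in R)."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def iqr(values):
    """Interquartile range of one probe set across all arrays."""
    return quantile(values, 0.75) - quantile(values, 0.25)


def nonspecific_filter(expr, min_intensity, min_samples, min_iqr):
    """Keep probe sets that are bright enough and variable enough.

    A probe set passes if at least `min_samples` arrays reach
    `min_intensity` (log2 scale) and its IQR across arrays is at
    least `min_iqr`. Group labels are never used.
    """
    kept = []
    for probe_id, values in expr.items():
        n_bright = sum(1 for v in values if v >= min_intensity)
        if n_bright >= min_samples and iqr(values) >= min_iqr:
            kept.append(probe_id)
    return kept


# Hypothetical log2 expression values: 4 wt arrays, then 4 mutant arrays.
expr = {
    "probe_a": [9.1, 9.3, 9.0, 9.2, 7.0, 7.2, 6.9, 7.1],          # bright, variable
    "probe_b": [4.1, 4.0, 4.2, 4.1, 4.0, 4.1, 4.2, 4.0],          # dim, flat
    "probe_c": [10.0, 10.1, 10.0, 10.1, 10.0, 10.1, 10.0, 10.1],  # bright, flat
}

kept = nonspecific_filter(expr, min_intensity=6.0, min_samples=4, min_iqr=0.5)
print(kept)
```

The filter deliberately ignores which arrays are wt and which are mutant,
so the cutoffs are chosen without looking at the comparison of interest.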