[BioC] nsFilter cutoff

James W. MacDonald jmacdon at med.umich.edu
Mon Jun 23 20:34:53 CEST 2008


Hi James,

james perkins wrote:
> Hi James
> 
> I meant when we have filterByQuantile as TRUE. In this case it seems to 
> behave differently, and I can't figure out why, and I don't want to guess!

OK. That's a different question. The details section of the help page 
explains this:

      Note that by default the numerical-filter cutoff is interpreted as
      a quantile, so leaving the default values intact would filter out
      50% of the genes remaining at this stage. If you prefer to set the
      cutoff at some absolute threshold, change the value of
      'varByQuantile' to 'FALSE', and modify 'var.cutoff' accordingly.

And looking at the code should help further:


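    if (var.filter) {
        ## per-probeset spread; var.func defaults to IQR
        esetIqr <- apply(exprs(eset), 1, var.func)
        if (filterByQuantile) {
            if (0 < var.cutoff && var.cutoff < 1) {
                ## reinterpret the cutoff as a quantile of the observed values
                var.cutoff = quantile(esetIqr, var.cutoff)
            }
            else stop("Cutoff Quantile has to be between 0 and 1.")
        }
        ## keep probesets whose spread exceeds the (possibly converted) cutoff
        selected <- esetIqr > var.cutoff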

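So if you leave filterByQuantile = TRUE (the argument that the help text 
quoted above calls 'varByQuantile'), then after the annotation-based 
filtering (GO, Entrez Gene, AFFX probesets, duplicates) nsFilter() takes 
what remains and, with the default var.cutoff of 0.5, filters out the 
50% of probesets with the smallest IQR.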
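
If it helps to see the two interpretations side by side, here is a rough 
base-R sketch of that step. It is not the genefilter code itself: the 
matrix `m` is simulated data standing in for exprs(eset), and 
`var_filter()` is a made-up helper name that only mimics the excerpt 
above.

```r
# Base-R sketch of the variance-filter step; simulated data, not a real ExpressionSet.
set.seed(1)
m <- matrix(rnorm(1000 * 10), nrow = 1000)   # hypothetical: 1000 probesets x 10 samples

# var_filter() is an illustrative helper, NOT a genefilter function.
var_filter <- function(x, var.cutoff = 0.5, filterByQuantile = TRUE,
                       var.func = IQR) {
    spread <- apply(x, 1, var.func)          # one spread value per row
    if (filterByQuantile) {
        stopifnot(var.cutoff > 0, var.cutoff < 1)
        # treat the cutoff as a quantile of the observed spreads
        var.cutoff <- quantile(spread, var.cutoff)
    }
    spread > var.cutoff                      # TRUE for rows that are kept
}

by_quantile <- var_filter(m)                            # cutoff = median of the IQRs
by_absolute <- var_filter(m, filterByQuantile = FALSE)  # cutoff = an IQR of 0.5

mean(by_quantile)   # fraction kept; about half by construction
mean(by_absolute)   # fraction kept; depends on the scale of the data
```

With the quantile interpretation the fraction kept is fixed by 
var.cutoff, whatever the scale of the data; with the absolute 
interpretation it depends entirely on how variable the probesets are.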

Does that help?

Best,

Jim


> 
> Regards,
> 
> Jim
> 
> James W. MacDonald wrote:
>> Hi James,
>>
>> james perkins wrote:
>>> Hi,
>>>
>>> I am finding the nsFilter IQR cutoff somewhat confusing.
>>>
>>> It says it is using IQR with a default cutoff of 0.5.
>>>
>>> This gives the impression that if you line up the data and take the 
>>> value between the 0.25 and 0.75 quantiles, you would keep the 
>>> probeset if this value was < 0.5
>>>
>>> However this is not the case, so I would like to know how exactly 
>>> does this work?
>>
>> Actually it _is_ the case - perhaps you misunderstand something.
>>
>> First, get all probesets with an IQR > 0.5
>> > T1 <- apply(exprs(sample.ExpressionSet), 1, IQR) > 0.5
>>
>> Now do the same using nsFilter()
>> > T2 <- nsFilter(sample.ExpressionSet, FALSE, filterByQuantile = FALSE,
>>                feature.exclude="", remove.dupEntrez = FALSE)
>>
>> Are they the same?
>> > all.equal(featureNames(sample.ExpressionSet)[T1],
>>             featureNames(T2$eset))
>> [1] TRUE
>>
>> Best,
>>
>> Jim
>>
>>
>>
>>>
>>> Regards,
>>>
>>> James
>>>
>>

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


