[BioC] remove.dupEntrez from nsFilter{genefilter}

Martin Morgan mtmorgan at fhcrc.org
Tue May 29 23:42:20 CEST 2012


On 05/29/2012 10:45 AM, Klemens Vierlinger [guest] wrote:
>
> Dear List,
> I seem to have a problem with the nsFilter function. For genes which are represented by more than one probe, it should keep the probe with the highest IQR and delete the others. It seems to me that in the example below this is not the case.
>
> Any ideas why?

Hi Klemens -- Nice question. When I look at ?nsFilter and 
selectMethod(nsFilter, "ExpressionSet"), I see IQRs calculated as

 > iqr = genefilter:::rowIQRs(exprs(test.es))
 > iqr[fData(test.es)$ENTREZ %in% 10000]
[1] 2.162743 2.177948

When I look at ?IQR and ?quantile, I see that there are 9 quantile types 
(and hence 9 flavours of IQR) from which I could choose. With type=3 I get

 > apply(exprs(test.es)[fData(test.es)$ENTREZ %in% 10000, ], 1, IQR, type=3)
A_23_P160354 A_24_P110983
     2.162743     2.177948
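
To illustrate why the quantile type matters for ranking, here is a small sketch in base R. The vectors a and b are made-up values (they are not taken from test.es), chosen so that the type=3 and type=7 definitions disagree about which one has the larger IQR:

```r
# Hypothetical expression values for two probes (not from test.es).
a <- c(0, 1, 2, 5, 6, 7)
b <- c(0, 1, 1, 4.5, 9, 10)

# IQR of each vector under three of the nine quantile types.
iqr_by_type <- sapply(c(type3 = 3, type7 = 7, type8 = 8), function(type) {
  c(a = IQR(a, type = type), b = IQR(b, type = type))
})
iqr_by_type

# Which vector "wins" (largest IQR) under each type.
winner <- apply(iqr_by_type, 2, function(col) names(col)[which.max(col)])
winner
```

With these values, type=3 ranks a above b, whereas type=7 and type=8 rank b above a, so a filter that keeps "the probe with the highest IQR" can keep a different probe depending on the type in use.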

Apparently the recommended type is 8, whereas R defaults to 7. To get R's 
default behaviour from nsFilter I could pass var.func explicitly:

 > remDup <- nsFilter(test.es, var.filter=FALSE, var.func=IQR)$eset
 > featureNames(remDup)[fData(remDup)$ENTREZ %in% 10000]
[1] "A_23_P160354"

or var.func = function(x) IQR(x, type=8) to use the recommended type.
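
As a rough sketch of the idea (not nsFilter's actual implementation), the "keep the probe with the largest var.func per Entrez ID" step can be mimicked in base R. The helper keep_top_probe, the matrix expr and the probe-to-ID mapping entrez are all hypothetical names and made-up data introduced only for illustration:

```r
# Hypothetical toy data: three probes, two of which share an Entrez ID.
expr <- rbind(
  probeA = c(0, 1, 2, 5, 6, 7),
  probeB = c(0, 1, 1, 4.5, 9, 10),
  probeC = c(3, 3, 4, 4, 5, 5)
)
entrez <- c(probeA = "10000", probeB = "10000", probeC = "207")

# Keep, for each Entrez ID, the probe whose row maximizes var.func.
keep_top_probe <- function(expr, entrez, var.func = IQR) {
  score <- apply(expr, 1, var.func)  # one score per probe, named by probe
  keep <- vapply(split(score, entrez),
                 function(s) names(s)[which.max(s)],
                 character(1))
  # Return the surviving rows in their original order.
  expr[sort(match(keep, rownames(expr))), , drop = FALSE]
}

iqr3 <- function(x) IQR(x, type = 3)
iqr8 <- function(x) IQR(x, type = 8)

rownames(keep_top_probe(expr, entrez, var.func = iqr3))
rownames(keep_top_probe(expr, entrez, var.func = iqr8))
```

On this toy data the two calls keep different probes for ID 10000, which is the same kind of discrepancy as in the original question.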

Martin

> Best
> Klemens
>
>
>
> require(genefilter)
> con<- url('http://rdf.ait.ac.at/attachments/download/102/test.es')
> load(con)
> close(con)
>
> # The ExpressionSet test.es contains two probes for the AKT3 gene (Entrez ID 10000), where A_23_P160354 is the one with the highest IQR.
> exprs(test.es)[fData(test.es)$ENTREZ %in% 10000, ]
> apply(exprs(test.es)[fData(test.es)$ENTREZ %in% 10000, ], 1, IQR)
>
> #However, if I apply nsFilter, the other Probe is kept
> remDup<- nsFilter(test.es, var.filter=F)$eset
> featureNames(remDup)[fData(remDup)$ENTREZ %in% 10000]
> exprs(remDup)[fData(remDup)$ENTREZ %in% 10000, ]
>
>
>
>   -- output of sessionInfo():
>
>> sessionInfo()
> R version 2.13.2 (2011-09-30)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
> [4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] HsAgilentDesign026652.db_2.5.0 org.Hs.eg.db_2.5.0             RSQLite_0.10.0                 DBI_0.2-5
> [5] AnnotationDbi_1.16.11          genefilter_1.34.0              Biobase_2.12.2
>
> loaded via a namespace (and not attached):
> [1] annotate_1.30.1 IRanges_1.10.6  splines_2.13.2  survival_2.36-9 tools_2.13.2    xtable_1.6-0
>>
>
>
> --
> Sent via the guest posting facility at bioconductor.org.
>


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

