[BioC] nsFilter error in genefilter

Robert Castelo robert.castelo at upf.edu
Wed Apr 17 18:24:27 CEST 2013


Dear Zhenya,

from your output below i'd say that the problem arises when mapping 
identifiers between the genes in gene sets from the 'GeneSetCollection' 
object and the features from the 'ExpressionSet' object.

my guess is that the problematic instruction is the line below, which 
i've written using your variable names. please run it and paste the 
result here:

library(Biobase)
library(GSEABase)

mapped.gset.idx.list <- mapIdentifiers(gsc,
    AnnoOrEntrezIdentifier(annotation(EsetData_f)))

if this line prompts an error, then in principle this is not so much a 
problem with GSVA as with how the identifier mapping magic should be 
working, and maybe the maintainers of GSEABase can give you a hand there.
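
as far as i can tell, that mapping relies on the annotation package named 
in the 'ExpressionSet', so a quick thing to check is whether such a 
package can be loaded at all. below is a minimal base R sketch: 
'check_annotation_pkg' is a hypothetical helper (not a function from 
Biobase or GSEABase), and trying the name both with and without the 
'.db' suffix is an assumption about how the package might be named.

```r
# Hypothetical helper (not part of Biobase or GSEABase): given the string
# stored in the annotation slot, report whether a matching package can be
# loaded, trying both the bare name and the name with a ".db" suffix.
check_annotation_pkg <- function(annot) {
  bare <- sub("\\.db$", "", annot)
  candidates <- unique(c(annot, paste0(bare, ".db")))
  # requireNamespace() returns FALSE instead of failing when the
  # package is not installed
  installed <- vapply(candidates, requireNamespace, logical(1),
                      quietly = TRUE)
  data.frame(package = candidates, installed = installed,
             row.names = NULL, stringsAsFactors = FALSE)
}

# the annotation string used in the original post
check_annotation_pkg("hthgu133pluspm")
```

with your data, and Biobase loaded, the call would be 
check_annotation_pkg(annotation(EsetData_f)).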

please also paste the output of traceback() right after you get the 
error, together with the display of the objects 'gsc' and 'EsetData_f'.
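
if it is easier, the error message and the call stack can be collected 
in one go. the following is only a sketch in base R: 'capture_error' is 
a hypothetical helper, not part of any package, and the stack it records 
is similar to, but not identical to, what traceback() prints.

```r
# Hypothetical helper: evaluate an expression and, if it fails, return the
# error message together with the call stack at the point of failure.
capture_error <- function(expr) {
  calls <- NULL
  tryCatch(
    withCallingHandlers(
      # 'expr' is evaluated lazily here, inside the handlers
      list(ok = TRUE, value = expr),
      # runs while the failing call is still on the stack
      error = function(e) calls <<- as.list(sys.calls())
    ),
    error = function(e) {
      list(ok = FALSE,
           message = conditionMessage(e),
           calls = vapply(calls, function(cl) deparse(cl)[1],
                          character(1)))
    }
  )
}

# toy example with a deliberate failure
res <- capture_error(stop("example failure"))
res$message
res$calls
```

you could then wrap the mapIdentifiers() call above in capture_error() 
and paste the 'message' and 'calls' elements of the result.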


cheers,
robert.

On 04/17/2013 06:02 PM, Zhenya [guest] wrote:
>
> Hi All,
>
> I am trying to run the code for GSVA (library with the same name). The code is below, but the main error is around annotation:
>> source("http://bioconductor.org/biocLite.R")
> Bioconductor version 2.12 (BiocInstaller 1.10.0), ?biocLite for help
>> biocLite("hthgu133pluspm.db")
> BioC_mirror: http://bioconductor.org
> Using Bioconductor version 2.12 (BiocInstaller 1.10.0), R version 3.0.0.
> Installing package(s) 'hthgu133pluspm.db'
> Warning message:
> package ‘hthgu133pluspm.db’ is not available (for R version 3.0.0)
>
> Code:
>
> # CREATE GeneSetCollection
> library(GSEABase)
> x<- scan("GeneSets.gmt", what="", sep="\n")
> GeneSets.gmt<- strsplit(x, "[[:space:]]+")
> names(GeneSets.gmt)<- sapply(GeneSets.gmt, `[[`, 1)
> GeneSets.gmt<- lapply(GeneSets.gmt, `[`, -1)
> n<- names(GeneSets.gmt)
> uniqueList<- lapply(GeneSets.gmt, unique)
> makeSet<- function(geneIds, n) {GeneSet(geneIds, geneIdType=SymbolIdentifier(), setName=n)}
> gsList<- gsc<- mapply(makeSet, uniqueList[], n)
> gsc<- GeneSetCollection(gsList)
>
> # DATASET
> # CREATE ExpressionSet
> exprs<- as.matrix(read.table("ExprData.txt", header=TRUE, sep="\t", row.names=1, as.is=TRUE))
> pData<- read.table("DesignFile.txt",row.names=1, header=T,sep="\t")
> phenoData<- new("AnnotatedDataFrame",data=pData)
> annotation<- "hthgu133pluspm.db"
> EsetData<- ExpressionSet(assayData=exprs,phenoData=phenoData,annotation="hthgu133pluspm")
> head(ExprData)
>
> #Gene Filtering
> library(genefilter)
> library("hthgu133pluspm")
> filtered_eset<- nsFilter(EsetData, require.entrez=TRUE, remove.dupEntrez=TRUE, var.func=IQR, var.filter=FALSE, var.cutoff=0.25, filterByQuantile=TRUE, feature.exclude="^AFFX")
> # get stats for numbers of probesets removed
> filtered_eset
> EsetData_f<- filtered_eset$eset
>
> # GSVA
> library(GSVA)
> gsva_es<- gsva(EsetData_f,gsc,abs.ranking=FALSE,min.sz=1,max.sz=1000,mx.diff=TRUE)$es.obs
>
> I downloaded hthgu133pluspm from http://nmg-r.bioinformatics.nl/NuGO_R.html
> and R still complains. The available on Bioconductor:
> hthgu133pluspmprobe
> and
> hthgu133pluspmcdf
> are not correct and give error for nsFilter and gsva:
> Error in (function (classes, fdef, mtable)  :
>    unable to find an inherited method for function ‘cols’ for signature ‘"environment"’
>
> Mapping identifiers between gene sets and feature names
> Error in GeneSetCollection(lapply(what, mapIdentifiers, to, ..., verbose = verbose)) :
>    error in evaluating the argument 'object' in selecting a method for function 'GeneSetCollection': Error in (function (classes, fdef, mtable)  :
>    unable to find an inherited method for function ‘cols’ for signature ‘"environment"’
>
>
> Thank you,
> Zhenya
>
>   -- output of sessionInfo():
>
> R version 3.0.0 (2013-04-03)
> Platform: i386-w64-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
>   [1] GSVA_1.8.0                 BiocInstaller_1.10.0       hthgu133pluspmprobe_2.12.0 hthgu133pluspmcdf_2.12.0   genefilter_1.42.0          GSEABase_1.22.0
>   [7] graph_1.38.0               annotate_1.38.0            AnnotationDbi_1.22.1       Biobase_2.20.0             BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
> [1] DBI_0.2-5       IRanges_1.18.0  RSQLite_0.11.2  splines_3.0.0   stats4_3.0.0    survival_2.37-4 tools_3.0.0     XML_3.96-1.1    xtable_1.7-1
>
>
> --
> Sent via the guest posting facility at bioconductor.org.
>

-- 
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550


