[BioC] creating a UniprotIdentifier within GSEABase

Martin Morgan mtmorgan at fhcrc.org
Sun Aug 7 21:31:00 CEST 2011


On 08/03/2011 12:03 AM, Mark Cowley wrote:
> Dear list,
> Within the GSEABase package, there are various identifier classes (e.g. GenenameIdentifier, EntrezIdentifier, ...). I'd like to create a UniprotIdentifier, since one does not already exist, and ultimately to convert between accession types using org.Hs.eg.db and the mapIdentifiers function.
>
> Following the hint in GeneIdentifierType-class.Rd, I've added a custom UniprotIdentifier class:

Thanks Mark, we'll try to incorporate this. Martin

> setClass("UniprotIdentifier",
>           contains="GeneIdentifierType",
>           prototype=prototype(
>             type=new("ScalarCharacter", "Uniprot")))
>
> # but the default constructor in that hint is too simple and does not support the annotation argument, which is crucial for mapping between GeneSet types:
> UniprotIdentifier <- function() new("UniprotIdentifier")
> UniprotIdentifier()
> # geneIdType: Uniprot
>
> # make a geneset
> up <- c("Q9Y6Q1", "A6NJZ7", "Q9BXI6", "Q15035", "A1X283", "P55957")
> gs <- GeneSet(up, geneIdType=UniprotIdentifier())
> gs
> # setName: NA
> # geneIds: Q9Y6Q1, A6NJZ7, ..., P55957 (total: 6)
> # geneIdType: Null
> # collectionType: Null
> # details: use 'details(object)'
>
> # try to map to EntrezIdentifier
> mapIdentifiers(gs, EntrezIdentifier())
> Error in .mapIdentifiers_isMappable(from, to) :
>    unable to map from 'Uniprot' to 'EntrezId'
>      neither GeneIdentifierType has annotation
>
>
> Given that the GenenameIdentifier constructor looks like this:
> GenenameIdentifier <- function (annotation, ...) {
>      args <- names(match.call())[-1]
>      GSEABase:::.checkRequired(NULL, args)
>      miss <- "annotation"[!"annotation" %in% args]
>      oargs <- list(annotation = annotation, ... = ...)[!names(list(
>          annotation = annotation, ... = ...)) %in% miss]
>      do.call(new, c("GenenameIdentifier", oargs))
> }
> #<environment: 0x1052a1308>
> GenenameIdentifier()
> # geneIdType: Genename
>
> # why does my constructor not work?
> UniprotIdentifier <- function (annotation, ...) {
>      args <- names(match.call())[-1]
>      GSEABase:::.checkRequired(NULL, args)
>      miss <- "annotation"[!"annotation" %in% args]
>      oargs <- list(annotation = annotation, ... = ...)[!names(list(
>          annotation = annotation, ... = ...)) %in% miss]
>      do.call(new, c("UniprotIdentifier", oargs))
> }
> UniprotIdentifier()
> # Error in list(annotation = annotation, ... = ...) :
> #   'annotation' is missing
>
> # FYI: here's a constructor that does work, but it differs considerably from the ones GSEABase provides.
> UniprotIdentifier <- function (annotation, ...) {
>      args <- names(match.call())[-1]
>      oargs <- if (missing(annotation)) list(... = ...) else list(annotation = annotation, ... = ...)
>      do.call(new, c("UniprotIdentifier", oargs))
> }
> UniprotIdentifier()
> # geneIdType: Uniprot
> UniprotIdentifier(annotation="org.Hs.eg.db")
> # geneIdType: Uniprot (org.Hs.eg.db)
>
> # With that final constructor, I can now convert from UniProt IDs to other types of ID:
> gs <- GeneSet(up)
> geneIdType(gs) <- UniprotIdentifier()
> gs <- GeneSet(up, geneIdType=UniprotIdentifier())
> gs <- GeneSet(up, geneIdType=UniprotIdentifier(annotation="org.Hs.eg.db"), setName="testA")
>
> mapIdentifiers(gs, EntrezIdentifier())
> # setName: testA
> # geneIds: 827, 83874, ..., 637 (total: 5)
> # geneIdType: EntrezId (org.Hs.eg.db)
> # collectionType: Null
> # details: use 'details(object)'
>
> cheers,
> Mark
> -----------------------------------------------------
> Mark Cowley, PhD
>
> Pancreatic Cancer Program | Peter Wills Bioinformatics Centre
> Garvan Institute of Medical Research, Sydney, Australia
> -----------------------------------------------------
>
>
> sessionInfo()
> R version 2.12.2 (2011-02-25)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] hgu95av2.db_2.4.5    org.Hs.eg.db_2.4.6   RSQLite_0.9-4
> [4] DBI_0.2-5            GSEABase_1.12.2-1    graph_1.28.0
> [7] annotate_1.28.0      AnnotationDbi_1.12.0 Biobase_2.10.0
>
> loaded via a namespace (and not attached):
> [1] tools_2.12.2 XML_3.2-0    xtable_1.5-6
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
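
On the "'annotation' is missing" error quoted above: in an ordinary R function, writing list(annotation = annotation, ...) evaluates the argument, so the call fails when the caller did not supply it, whereas a missing() guard forwards it only when present. The following is a minimal base-R sketch of that difference. It assumes a hypothetical stand-in class, DemoIdentifier, and two illustrative functions, forcing() and guarded(); none of these are part of GSEABase, and the sketch does not explain how GSEABase builds its own constructors.

```r
library(methods)

# Hypothetical stand-in for a GeneIdentifierType subclass; not a GSEABase class.
setClass("DemoIdentifier",
         representation(type = "character", annotation = "character"),
         prototype(type = "Demo", annotation = character(0)))

# Evaluating 'annotation' inside list() forces the argument,
# which signals an error when the caller did not supply it.
forcing <- function(annotation, ...) {
    do.call(new, c("DemoIdentifier", list(annotation = annotation, ...)))
}

# Guarding with missing() forwards 'annotation' only when it was supplied.
guarded <- function(annotation, ...) {
    oargs <- list(...)
    if (!missing(annotation))
        oargs <- c(list(annotation = annotation), oargs)
    do.call(new, c("DemoIdentifier", oargs))
}

# TRUE if calling forcing() without 'annotation' raised an error.
forcing_failed <- inherits(try(forcing(), silent = TRUE), "try-error")

no_annotation <- guarded()
with_annotation <- guarded(annotation = "org.Hs.eg.db")
```

The guarded() form mirrors the shape of the final UniprotIdentifier constructor quoted above: the annotation slot keeps its prototype value unless the caller passes one.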


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


