[BioC] hugene10sttranscriptclusterACCNUM has no mappings

Fri Aug 29 16:15:44 CEST 2014

Hi Thomas,

I built that package, and as you note, there are no accession numbers. But
maybe that is because I misunderstand something, so I am directly including
Marc Carlson in this conversation.

Since the annotation packages are Gene ID-centric, I create two files, one
with probeid->GeneID, and one with probeid->GenBank/RefSeq ID. I then use
the first file as the primary annotation file, and the second as the
'otherSrc' file. If I then run makeDBPackage(), I get this output:

baseMapType is eg
Prepending Metadata
Creating Genes table
Appending Probes
Found 0 Probe Accessions
Appending Gene Info
Found 19962 Gene Names
Found 19962 Gene Symbols
<snip>

But if I then reverse the source files, using the second file as the
primary annotation file, and the GeneID file as the 'otherSrc' file, I get:

baseMapType is gb or gbNRef
Prepending Metadata
Creating Genes table
Appending Probes
Found 21941 Probe Accessions
Appending Gene Info
Found 20195 Gene Names
Found 20195 Gene Symbols
<snip>

From my understanding of the SQLForge vignette, I should be able to use
either ordering, and get identical results, but obviously this is not the
case. Marc, can you shed some light on this? Evidently I should re-make the
packages using gbNRef rather than eg as the baseMapType.
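
For reference, here is a minimal sketch of the two builds being compared. It
assumes makeDBPackage() is the one from AnnotationForge and that the input
files are two-column, tab-delimited and header-less. The toy IDs, file names,
version, chip name and the helpers write_map() and build_pkg() are made up for
illustration and are not the actual build script.

```r
# Hypothetical toy annotation; real input would come from the manufacturer's
# annotation file. All IDs below are placeholders, and handling of probes
# without an annotation is not shown.
annot <- data.frame(
  probe  = c("7892501", "7892502", "7892503"),
  entrez = c("1", "2", "9"),
  acc    = c("NM_000001", "NM_000002", "NR_000003"),
  stringsAsFactors = FALSE
)

# Write a two-column, tab-delimited file without a header
# (probe ID, then the other ID) and return its path.
write_map <- function(df, id_col, path) {
  write.table(df[, c("probe", id_col)], file = path, sep = "\t",
              quote = FALSE, row.names = FALSE, col.names = FALSE)
  path
}

eg_file <- write_map(annot, "entrez", file.path(tempdir(), "probe2eg.txt"))
gb_file <- write_map(annot, "acc", file.path(tempdir(), "probe2acc.txt"))

# Thin wrapper around makeDBPackage(). It is only defined here, not called,
# because it needs AnnotationForge and human.db0 to be installed.
build_pkg <- function(primary, other, base_map_type, out_dir) {
  AnnotationForge::makeDBPackage(
    "HUMANCHIP_DB",
    affy = FALSE,
    prefix = "hugene10sttranscriptcluster",
    fileName = primary,            # primary annotation file
    otherSrc = other,              # secondary ('otherSrc') file
    baseMapType = base_map_type,   # ID type used in the primary file
    outputDir = out_dir,
    version = "0.0.1",                       # placeholder
    manufacturer = "Affymetrix",
    chipName = "Human Gene 1.0 ST Array",    # placeholder
    manufacturerUrl = "http://www.affymetrix.com"
  )
}

# The two orderings described above (uncomment to run):
# build_pkg(eg_file, gb_file, "eg", tempdir())      # Gene ID file as primary
# build_pkg(gb_file, eg_file, "gbNRef", tempdir())  # accession file as primary
```

The only difference between the two calls is which file is passed as fileName
and the matching baseMapType, so any difference in the "Found ... Probe
Accessions" line comes from that choice.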

Best,

Jim

On Fri, Aug 29, 2014 at 4:30 AM, Thomas Pfau <thomas.pfau at uni.lu> wrote:

> Hello,
>
> I just tried to get a probe-to-accession mapping from the above annotation
> database, but it does not yield any mappings for accessions, i.e.
> x <- hugene10sttranscriptclusterACCNUM
> mapped_probes <- mappedkeys(x)
> yields an empty mapped_probes list.
>
>
> I'm running R 3.1.1 on Ubuntu.
> The loaded packages are:
>
>  [1] oligo_1.28.2 Biostrings_2.32.1 XVector_0.4.0
>  [4] IRanges_1.22.10 oligoClasses_1.26.0 hugene10sttranscriptcluster.db_8.1.0
>  [7] org.Hs.eg.db_2.14.0 RSQLite_0.11.4 DBI_0.2-7
> [10] AnnotationDbi_1.26.0 GenomeInfoDb_1.0.2 Biobase_2.24.0
> [13] BiocGenerics_0.10.0 BiocInstaller_1.14.2
>
> and capture.output(hugene10sttranscriptcluster()) yields:
>  [1] "Quality control information for hugene10sttranscriptcluster:"
>  [2] ""
>  [3] ""
>  [4] "This package has the following mappings:"
>  [5] ""
>  [6] "hugene10sttranscriptclusterACCNUM has 0 mapped keys (of 33297 keys)"
>  [7] "hugene10sttranscriptclusterALIAS2PROBE has 60778 mapped keys (of 103510 keys)"
>  [8] "hugene10sttranscriptclusterCHR has 19962 mapped keys (of 33297 keys)"
>  [9] "hugene10sttranscriptclusterCHRLENGTHS has 93 mapped keys (of 93 keys)"
> [10] "hugene10sttranscriptclusterCHRLOC has 19424 mapped keys (of 33297 keys)"
> [11] "hugene10sttranscriptclusterCHRLOCEND has 19424 mapped keys (of 33297 keys)"
> [12] "hugene10sttranscriptclusterENSEMBL has 19416 mapped keys (of 33297 keys)"
> [13] "hugene10sttranscriptclusterENSEMBL2PROBE has 20590 mapped keys (of 28046 keys)"
> [14] "hugene10sttranscriptclusterENTREZID has 19962 mapped keys (of 33297 keys)"
> [15] "hugene10sttranscriptclusterENZYME has 2201 mapped keys (of 33297 keys)"
> [16] "hugene10sttranscriptclusterENZYME2PROBE has 958 mapped keys (of 975 keys)"
> [17] "hugene10sttranscriptclusterGENENAME has 19962 mapped keys (of 33297 keys)"
> [18] "hugene10sttranscriptclusterGO has 17412 mapped keys (of 33297 keys)"
> [19] "hugene10sttranscriptclusterGO2ALLPROBES has 17930 mapped keys (of 18078 keys)"
> [20] "hugene10sttranscriptclusterGO2PROBE has 13970 mapped keys (of 14134 keys)"
> [21] "hugene10sttranscriptclusterMAP has 19832 mapped keys (of 33297 keys)"
> [22] "hugene10sttranscriptclusterOMIM has 13778 mapped keys (of 33297 keys)"
> [23] "hugene10sttranscriptclusterPATH has 5768 mapped keys (of 33297 keys)"
> [24] "hugene10sttranscriptclusterPATH2PROBE has 229 mapped keys (of 229 keys)"
> [25] "hugene10sttranscriptclusterPFAM has 18146 mapped keys (of 33297 keys)"
> [26] "hugene10sttranscriptclusterPMID has 19726 mapped keys (of 33297 keys)"
> [27] "hugene10sttranscriptclusterPMID2PROBE has 396421 mapped keys (of 412133 keys)"
> [28] "hugene10sttranscriptclusterPROSITE has 18146 mapped keys (of 33297 keys)"
> [29] "hugene10sttranscriptclusterREFSEQ has 19873 mapped keys (of 33297 keys)"
> [30] "hugene10sttranscriptclusterSYMBOL has 19962 mapped keys (of 33297 keys)"
> [31] "hugene10sttranscriptclusterUNIGENE has 19578 mapped keys (of 33297 keys)"
> [32] "hugene10sttranscriptclusterUNIPROT has 18193 mapped keys (of 33297 keys)"
> [33] ""
> [34] ""
> [35] "Additional Information about this package:"
> [36] ""
> [37] "DB schema: HUMANCHIP_DB"
> [38] "DB schema version: 2.1"
> [39] "Organism: Homo sapiens"
> [40] "Date for NCBI data: 2014-Mar13"
> [41] "Date for GO data: 20140308"
> [42] "Date for KEGG data: 2011-Mar15"
> [43] "Date for Golden Path data: 2010-Mar22"
> [44] "Date for Ensembl data: 2014-Feb26"
>
> It seems like something is broken there, as shown in line [6] of the output:
>  [6] "hugene10sttranscriptclusterACCNUM has 0 mapped keys (of 33297 keys)"
>
> Any ideas on how to solve this? Or whether this is a bug on my side or on
> the package side?
>
> Kind Regards
>
> Thomas
>
>
> --
> Université du Luxembourg
> Faculté des Sciences, de la Technologie et de la Communication
> Campus Limpertsberg, BRB 2.13
> 162a, avenue de la Faïencerie
> L-1511 Luxembourg
> Email: thomas.pfau at uni.lu

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099