[BioC] retrieving annotation

Kathi Zarnack zarnack at ebi.ac.uk
Tue Nov 19 16:13:13 CET 2013


Hi Herve,

thanks for pointing me to the files and also for updating 
GenomicFeatures. It's a great package! I'll let you know if I run into any 
other problems.
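
For the archive, here is a minimal, untested sketch of how the hg19 tables 
from your message could be used once the update is installed. The helpers 
gencodeTableNames() and buildGencodeTxDb() are made-up names for 
illustration, not part of GenomicFeatures. Only the V17 table names come from 
your list; that V14 and V7 follow the same naming pattern is an assumption. 
The download step needs GenomicFeatures >= 1.14.2 and a connection to UCSC, 
so it is left commented out.

```r
## Build the five GENCODE table names for a given release, following the
## naming pattern listed for V17. Hypothetical helper, not part of
## GenomicFeatures; the pattern for other releases is an assumption.
gencodeTableNames <- function(version) {
    kinds <- c("Basic", "Comp", "PseudoGene", "2wayConsPseudo", "Polya")
    paste0("wgEncodeGencode", kinds, "V", version)
}

## Hypothetical wrapper: download one table and build a TranscriptDb.
## Assumes GenomicFeatures >= 1.14.2 is installed and UCSC is reachable.
buildGencodeTxDb <- function(tablename, genome = "hg19") {
    GenomicFeatures::makeTranscriptDbFromUCSC(genome = genome,
                                              tablename = tablename)
}

tables <- gencodeTableNames(17)
print(tables)

## Not run here (needs network access):
## txdb <- buildGencodeTxDb(tables[1])   # wgEncodeGencodeBasicV17
## GenomicFeatures::transcripts(txdb)
```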

Best regards,
Kathi


On 17/11/13 21:43, Hervé Pagès wrote:
> Hi Kathi,
>
> On 11/07/2013 05:11 AM, Kathi Zarnack wrote:
>> Hi,
>>
>> I wanted to ask whether any of the annotation packages contains
>> information on the transcript biotype (protein-coding, etc). I would
>> like to select only protein-coding isoforms from Ensembl annotation, but
>> I could not find any package that includes this information (otherwise I
>> will get it with biomaRt, I just wondered whether it is already included
>> somewhere).
>>
>> Also, I tried to download GENCODE annotation using GenomicFeatures, and
>> got the following error:
>>
>>  > test=makeTranscriptDbFromUCSC(genome="hg19",
>> tablename="wgEncodeGencodeManualV3")
>> Error in tableNames(ucscTableQuery(session, track = track)) :
>>    error in evaluating the argument 'object' in selecting a method for
>> function 'tableNames': Error in normArgTrack(track, trackids) : Unknown
>> track: Gencode Genes
>>
>> I tried to get the same table for hg18, but I get only one step further:
>>
>> test=makeTranscriptDbFromUCSC(genome="hg18",
>> tablename="wgEncodeGencodeManualV3")
>> Download the wgEncodeGencodeManualV3 table ... OK
>> Download the wgEncodeGencodeClassesV3 table ... Error in
>> normArgTable(value, x) :
>>    unknown table name 'wgEncodeGencodeClassesV3'
>
> Note that the wgEncodeGencodeManualV3 table seems to be for hg18
> only: there doesn't seem to be such a table for hg19.
>
> For hg19, UCSC provides 3 GENCODE tracks: GENCODE Genes V17, GENCODE
> Genes V14, and GENCODE Genes V7. Each of them contains 5 tables
> that are compatible with makeTranscriptDbFromUCSC(). For example,
> for GENCODE Genes V17, those tables are:
>
>   wgEncodeGencodeBasicV17
>   wgEncodeGencodeCompV17
>   wgEncodeGencodePseudoGeneV17
>   wgEncodeGencode2wayConsPseudoV17
>   wgEncodeGencodePolyaV17
>
> See here for the details:
>
> http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeGencodeSuper
>
> I just made some adjustments to the GenomicFeatures package so
> makeTranscriptDbFromUCSC() can work on those tables. Unfortunately
> I also needed to fix support for the wgEncodeGencode*V3 tables (for
> hg18) which was broken due to changes on the UCSC side.
>
> Those updates are in GenomicFeatures 1.14.2 (release) and 1.15.4
> (devel). Both should become available via biocLite() in the next 24
> hours or so.
>
> Please let us know if you run into any other problem with the
> GenomicFeatures package.
>
> Thanks,
> H.
>
>
>>
>> Thank you very much for your help,
>> Kathi
>>
>>
>> ------------------------------------------
>>
>>  > sessionInfo()
>> R version 3.0.2 (2013-09-25)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>>   [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>>   [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>>   [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
>>   [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] parallel  stats     graphics  grDevices utils     datasets methods
>> [8] base
>>
>> other attached packages:
>> [1] GenomicFeatures_1.14.0 AnnotationDbi_1.24.0 Biobase_2.22.0
>> [4] GenomicRanges_1.14.3   XVector_0.2.0 IRanges_1.20.5
>> [7] BiocGenerics_0.8.0     BiocInstaller_1.12.0
>>
>> loaded via a namespace (and not attached):
>>   [1] biomaRt_2.18.0     Biostrings_2.30.0  bitops_1.0-6 BSgenome_1.30.0
>>   [5] DBI_0.2-7          RCurl_1.95-4.1     Rsamtools_1.14.1   RSQLite_0.11.4
>>   [9] rtracklayer_1.22.0 stats4_3.0.2       tcltk_3.0.2 tools_3.0.2
>> [13] XML_3.98-1.1       zlibbioc_1.8.0
>>
>>
>
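
On the biotype question quoted above, a rough sketch of the biomaRt route: 
fetch the transcript biotype for every Ensembl transcript, keep the 
protein-coding IDs, and pass them to makeTranscriptDbFromBiomart(). The 
helper proteinCodingIds() is a made-up name, and the choice of the 
hsapiens_gene_ensembl dataset and the transcript_biotype attribute are 
assumptions on my side. The download steps need network access and are 
left commented out.

```r
## Hypothetical helper: given a data.frame with columns
## 'ensembl_transcript_id' and 'transcript_biotype' (the shape getBM()
## is expected to return below), return the unique IDs of transcripts
## with the requested biotype. %in% keeps NA biotypes out of the result.
proteinCodingIds <- function(tab, biotype = "protein_coding") {
    keep <- tab$transcript_biotype %in% biotype
    unique(tab$ensembl_transcript_id[keep])
}

## Not run here (needs biomaRt, GenomicFeatures and network access):
## library(biomaRt)
## library(GenomicFeatures)
## mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
## tab  <- getBM(attributes = c("ensembl_transcript_id",
##                              "transcript_biotype"),
##               mart = mart)
## ids  <- proteinCodingIds(tab)
## txdb <- makeTranscriptDbFromBiomart(biomart = "ensembl",
##                                     dataset = "hsapiens_gene_ensembl",
##                                     transcript_ids = ids)
```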

-- 
Dr. Kathi Zarnack
Luscombe Group

European Molecular Biology Laboratory
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

email zarnack at ebi.ac.uk
tel +44 1223 494 526
