[BioC] Building a TranscriptDb object from a lncRNA gtf file

Marc Carlson mcarlson at fhcrc.org
Fri Feb 15 21:42:27 CET 2013


Hi Fong,

I downloaded this file.  After removing its very first line (which
contained only "##gtf"), makeTranscriptDbFromGFF() ran fine for me.
Did you look at the file before you ran it and notice that line of
cruft at the top?  Also, what was your sessionInfo()?
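
In case it helps, here is a minimal sketch of that cleanup in base R.
The helper name strip_gtf_header() and the ".clean.gtf" output path are
made up for illustration (they are not part of GenomicFeatures), and
the two demo records are invented just to make the example runnable.
It assumes the only problem is a first line consisting of "##gtf".

```r
# Hypothetical helper (not part of GenomicFeatures): copy a GTF file,
# dropping the first line if it consists only of "##gtf".
strip_gtf_header <- function(infile, outfile) {
  lines <- readLines(infile)
  if (length(lines) > 0 && grepl("^##gtf[[:space:]]*$", lines[1])) {
    lines <- lines[-1]
  }
  writeLines(lines, outfile)
  invisible(outfile)
}

# Self-contained demonstration on an invented two-record file.
demo_in  <- tempfile(fileext = ".gtf")
demo_out <- tempfile(fileext = ".gtf")
writeLines(c(
  "##gtf",
  "chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id \"g1\"; transcript_id \"t1\";",
  "chr1\tdemo\texon\t300\t400\t.\t+\t.\tgene_id \"g1\"; transcript_id \"t1\";"
), demo_in)
strip_gtf_header(demo_in, demo_out)
cleaned <- readLines(demo_out)

# With the real file, the call from the original post would then be
# (output path is an assumption; left commented out because it needs
# GenomicFeatures and the downloaded file):
# library(GenomicFeatures)
# clean <- strip_gtf_header("~/share/references/lncipedia_2_1.gtf",
#                           "~/share/references/lncipedia_2_1.clean.gtf")
# lncRNADb <- makeTranscriptDbFromGFF(clean, format = "gtf",
#                                     dataSource = "http://www.lncipedia.org/",
#                                     species = "all")
```

Writing to a separate file keeps the original download untouched, so
the two can still be compared if something else goes wrong.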


  Marc


On 02/14/2013 12:54 PM, Fong Chun Chan wrote:
> Hi,
>
> I am trying to use the makeTranscriptDbFromGFF() function from the
> GenomicFeatures R package to build a transcriptDB from a lncRNA gff file
> available from http://www.lncipedia.org/download (version 2.1).
>
> It gives this error when I run it:
>
> $>  lncRNADb <- makeTranscriptDbFromGFF(
> '~/share/references/lncipedia_2_1.gtf', format = 'gtf',
> dataSource = 'http://www.lncipedia.org/', species = 'all' )
>
> extracting transcript information
> Estimating transcript ranges.
> Extracting gene IDs
> Processing splicing information for gtf file.
> Deducing exon rank from relative coordinates provided
> Prepare the 'metadata' data frame ... metadata: OK
> Now generating chrominfo from available sequence names. No chromosome
> length information is available.
> Error in .normargSplicings(splicings, transcripts_tx_id) :
>    'splicings$cds_start' must be an integer vector
> In addition: Warning messages:
> 1: In .deduceExonRankings(exs, format = "gtf") :
>    Infering Exon Rankings.  If this is not what you expected, then please be
> sure that you have provided a valid attribute for exonRankAttributeName
> 2: In matchCircularity(chroms, circ_seqs) :
>    None of the strings in your circ_seqs argument match your seqnames.
>
> Has anyone encountered this error before? Any help would be greatly
> appreciated. Below is my sessionInfo(). Thanks,
>
> Fong
>
> ---
>
>> sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] GenomicFeatures_1.10.1 AnnotationDbi_1.20.3   Biobase_2.18.0
> [4] GenomicRanges_1.10.6   IRanges_1.16.4         BiocGenerics_0.4.0
>
> loaded via a namespace (and not attached):
>   [1] biomaRt_2.14.0     Biostrings_2.26.3  bitops_1.0-5       BSgenome_1.26.1
>   [5] DBI_0.2-5          parallel_2.15.2    RCurl_1.95-3       Rsamtools_1.10.2
>   [9] RSQLite_0.11.2     rtracklayer_1.18.2 stats4_2.15.2      tools_2.15.2
> [13] XML_3.95-0.1       zlibbioc_1.4.0


