[BioC] TXNAME mapping

James W. MacDonald jmacdon at uw.edu
Sat Jun 22 04:36:54 CEST 2013


Hi Murli,

I think you will need to show a small example script that gives this 
result. I see only one region that corresponds to that TXNAME:

 > x <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, by="tx", use.names=TRUE)
 > x["uc003ytw.3"]
GRangesList of length 1:
$uc003ytw.3
GRanges with 48 ranges and 3 metadata columns:
        seqnames                 ranges strand   |   exon_id   exon_name exon_rank
           <Rle>              <IRanges>  <Rle>   | <integer> <character> <integer>
    [1]     chr8 [133879205, 133879312]      +   |    116041        <NA>         1
    [2]     chr8 [133880360, 133880468]      +   |    116042        <NA>         2
    [3]     chr8 [133881974, 133882071]      +   |    116043        <NA>         3
    [4]     chr8 [133883593, 133883796]      +   |    116044        <NA>         4
    [5]     chr8 [133885307, 133885466]      +   |    116045        <NA>         5
    ...      ...                    ...    ... ...       ...         ...       ...
   [44]     chr8 [134125666, 134125847]      +   |    116085        <NA>        44
   [45]     chr8 [134128853, 134128960]      +   |    116086        <NA>        45
   [46]     chr8 [134144056, 134144190]      +   |    116087        <NA>        46
   [47]     chr8 [134145714, 134145904]      +   |    116088        <NA>        47
   [48]     chr8 [134146920, 134147143]      +   |    116089        <NA>        48

 > select(Homo.sapiens, "uc003ytw.3", c("TXID","GENEID","CHR", 
"CHRLOC","CHRLOCEND"), "TXNAME")
       TXNAME GENEID  TXID CHR    CHRLOC CHRLOCCHR CHRLOCEND
1 uc003ytw.3   7038 32071   8 133879205         8 134147143
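
If it helps to narrow things down, one quick check is to count how many
distinct locations each TXNAME has in the merged table. This is only a
minimal base-R sketch, using a few rows copied from the mrg.data output
quoted below; the object names mrg, loc and n_loc are made up for
illustration and are not from your script:

```r
# A few rows copied from the quoted mrg.data output (rows 1000, 1001,
# 1004, 1005 and 1008); 'mrg' is a hypothetical stand-in for mrg.data
mrg <- data.frame(
  TXNAME   = c("uc003ytw.3", "uc002qnd.3", "uc003ytw.3",
               "uc002qnd.3", "uc003ytw.3"),
  seqnames = c("chr8", "chr8", "chr8", "chr12", "chr12"),
  start    = c(13515402, 14339379, 14339379, 71612488, 71612488),
  stringsAsFactors = FALSE
)

# Build a simple location key and count the distinct keys per TXNAME
loc   <- paste(mrg$seqnames, mrg$start, sep = ":")
n_loc <- tapply(loc, mrg$TXNAME, function(x) length(unique(x)))

# TXNAMEs attached to more than one location in the merged table
n_loc[n_loc > 1]
```

Any TXNAME with a count above 1 is attached to more than one location in
the merged table, so the step that produced those rows is the place to
look.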

Best,

Jim



On 6/21/2013 10:16 PM, Murli [guest] wrote:
> Hi,
>
> I am annotating my reads using TxDb.Hsapiens.UCSC.hg19.knownGene and org.Hs.eg.db. I am able to get everything to work and to merge the data, but when I reviewed the output I saw that the same TXNAME is mapped to different locations. See part of the output below. TXNAME uc003ytw.3 is associated with chr8  13515402  13515702   301 and  chr12  71612488  71612788   301. I thought it should be unique. I would appreciate it if you could correct me if I am misunderstanding TXNAME.
>
> Thanks ../Murli
>
>
>
>
>> mrg.data[1000:1100,]
>        TXID GENEID     TXNAME seqnames     start       end width strand
> 1000 32071   7038 uc003ytw.3     chr8  13515402  13515702   301      *
> 1001 68728  63934 uc002qnd.3     chr8  14339379  14339679   301      *
> 1002 68729  63934 uc002qne.3     chr8  14339379  14339679   301      *
> 1003 68730  63934 uc010etm.3     chr8  14339379  14339679   301      *
> 1004 32071   7038 uc003ytw.3     chr8  14339379  14339679   301      *
> 1005 68728  63934 uc002qnd.3    chr12  71612488  71612788   301      *
> 1006 68729  63934 uc002qne.3    chr12  71612488  71612788   301      *
> 1007 68730  63934 uc010etm.3    chr12  71612488  71612788   301      *
> 1008 32071   7038 uc003ytw.3    chr12  71612488  71612788   301      *
> 1009 68728  63934 uc002qnd.3    chr14  24809972  24810272   301      *
> 1010 68729  63934 uc002qne.3    chr14  24809972  24810272   301      *
>
>
>
>
>   -- output of sessionInfo():
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
>   [1] Homo.sapiens_1.1.1
>   [2] GO.db_2.9.0
>   [3] OrganismDbi_1.2.0
>   [4] org.Hs.eg.db_2.9.0
>   [5] RSQLite_0.11.4
>   [6] DBI_0.2-7
>   [7] VariantAnnotation_1.6.6
>   [8] Rsamtools_1.12.3
>   [9] BSgenome.Hsapiens.UCSC.hg19_1.3.19
> [10] BSgenome_1.28.0
> [11] Biostrings_2.28.0
> [12] TxDb.Hsapiens.UCSC.hg19.knownGene_2.9.2
> [13] GenomicFeatures_1.12.2
> [14] AnnotationDbi_1.22.6
> [15] Biobase_2.20.0
> [16] GenomicRanges_1.12.4
> [17] IRanges_1.18.1
> [18] BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
>   [1] biomaRt_2.16.0     bitops_1.0-5       graph_1.38.2       RBGL_1.36.2
>   [5] RCurl_1.95-4.1     rtracklayer_1.20.2 stats4_3.0.1       tools_3.0.1
>   [9] XML_3.98-1.1       zlibbioc_1.6.0

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
