[BioC] GRanges list and reduce function

Martin Morgan mtmorgan at fhcrc.org
Fri Aug 15 16:56:34 CEST 2014


On 08/15/2014 03:20 AM, Asma rabe wrote:
> Hi ,
>
>
> I need a Granges object with exons data for  few chromosomes, i got Granges
> list of transcripts and their exons as follows:
>
>
> library("TxDb.Hsapiens.UCSC.hg19.knownGene")
>
> txdb<-TxDb.Hsapiens.UCSC.hg19.knownGene
>
> tx_Exons<-exonsBy(txdb)
>
>
>
> 1-How to use reduce on Granges list?how to get the unique exons only and
> exclude redundant exons?
>

I'm not sure what this means -- you've asked for exons grouped by transcript, 
and there are no 'extra' exons within each transcript. Did you want 
exonsBy(txdb, "gene")?

reduce(tx_Exons) reduces within each transcript (list element); I'm not sure 
what you'd really like to do?
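
To make the "within each list element" behaviour concrete, here is a minimal sketch on toy data rather than the real txdb. The coordinates and the names tx1, tx2, grl, per_tx and pooled are made up for illustration. The last two calls show one possible reading of "unique exons" (pooling exons across transcripts first), which is an assumption about the intent, not something the question confirms.

```r
library(GenomicRanges)

# Toy stand-in for exonsBy(txdb): two "transcripts" with hypothetical coordinates.
tx1 <- GRanges("chr10", IRanges(start = c(100, 150, 400), end = c(200, 250, 500)),
               strand = "+")
tx2 <- GRanges("chr10", IRanges(start = c(100, 400), end = c(200, 500)),
               strand = "+")
grl <- GRangesList(tx1 = tx1, tx2 = tx2)

# reduce() on a GRangesList merges overlapping ranges within each element only;
# ranges shared between tx1 and tx2 are not merged with each other.
per_tx <- reduce(grl)
elementNROWS(per_tx)

# Assumed alternative: pool all exons first, then de-duplicate or merge.
pooled <- unlist(grl, use.names = FALSE)
unique(pooled)   # drops exact duplicate ranges
reduce(pooled)   # merges overlapping ranges across all transcripts
```

The result of reduce(grl) is still a GRangesList with one element per transcript, whereas the pooled calls return a single GRanges.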

>
> 2-How to select exons of certain chromosomes only ex: chr10? i tried the
> following but i wonder why i got  GRnages list with empty Grange lists??

If you want to select transcripts where all exons are on certain chromosomes, 
note that

   seqnames(tx_Exons) %in% "chr10"

returns an RleList, and

   all(seqnames(tx_Exons) %in% "chr10")

asks element-wise whether all elements of each Rle are TRUE, returning a logical 
vector of the same length as tx_Exons. So

   tx_Exons[all(seqnames(tx_Exons) %in% "chr10")]

returns the transcripts with all exons on chr10. For exons grouped by _gene_, 
it's possible that genes are annotated to contain exons from different 
chromosomes:

> exByGn = exonsBy(txdb, "gene")
> table(elementLengths(runLength(seqnames(exByGn))))

     1     2     3     4     5     6     7     8
23182    77     4     3    19    38    76    60

To keep only the exons on chr10 while preserving the grouping by gene, subset 
with the list of logicals directly; genes without any exons on chr10 are left 
as empty elements:

> chr10 <- exByGn[seqnames(exByGn) %in% "chr10"]

This is what you did below. The result is not empty: it has the same length as 
the input, and the many elements with no exons on chr10 are simply left empty, 
while the elements that do have chr10 exons sit deeper in the list. Here I 
remove the elements with 0 ranges.

> chr10[elementLengths(chr10) != 0]
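
The two kinds of subsetting above can be compared side by side on a small example. This is only a sketch on made-up data: the gene names g1, g2, g3 and all coordinates are hypothetical, and elementNROWS() is used where this post uses elementLengths(), since the function was renamed in later Bioconductor releases.

```r
library(GenomicRanges)

# Toy stand-in for exonsBy(txdb, "gene"): three hypothetical genes.
exons <- GRanges(
    seqnames = c("chr10", "chr10", "chr1", "chr1", "chr10", "chr1"),
    ranges   = IRanges(start = c(100, 300, 100, 300, 100, 300), width = 50)
)
grl <- split(exons, c("g1", "g1", "g2", "g2", "g3", "g3"))

# Number of seqname runs per gene; more than one run means the gene's exons
# are spread over more than one chromosome (g3 here).
elementNROWS(runLength(seqnames(grl)))

# One logical per list element: keep genes whose exons are ALL on chr10.
all_chr10 <- grl[all(seqnames(grl) %in% "chr10")]

# One logical per exon: keep only chr10 exons. The result has the same
# length as the input; genes without chr10 exons become empty elements.
chr10 <- grl[seqnames(grl) %in% "chr10"]

# Drop the empty elements.
nonempty <- chr10[elementNROWS(chr10) != 0]
```

In this toy case all_chr10 keeps only g1, chr10 still has three elements (g2 is empty and g3 keeps one exon), and nonempty keeps g1 and g3.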

Martin

>
>
> chr10<-tx_Exons[seqnames(tx_Exons)=="chr10",]
>
>
>> chr10
>
> GRangesList of length 80922:
>
> $1
>
> GRanges with 0 ranges and 3 metadata columns:
>
>     seqnames    ranges strand |   exon_id   exon_name exon_rank
>
>        <Rle> <IRanges>  <Rle> | <integer> <character> <integer>
>
>
> $2
>
> GRanges with 0 ranges and 3 metadata columns:
>
>       seqnames ranges strand | exon_id exon_name exon_rank
>
>
> $3
>
> GRanges with 0 ranges and 3 metadata columns:
>
>       seqnames ranges strand | exon_id exon_name exon_rank
>
>
> ...
>
> <80919 more elements>
>
> ---
>
> seqlengths:
>
>                    chr1                  chr2 ...        chrUn_gl000249
>
>               249250621             243199373 ...                 38502
>
>
>
>> length(chr10)
>
> [1] 80922
>
>> length(tx_Exons)
>
> [1] 80922
>
>
> Thank you
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


