[BioC] GRanges list and reduce function
Martin Morgan
mtmorgan at fhcrc.org
Fri Aug 15 16:56:34 CEST 2014
On 08/15/2014 03:20 AM, Asma rabe wrote:
> Hi,
>
> I need a GRanges object with exon data for a few chromosomes. I got a
> GRangesList of transcripts and their exons as follows:
>
> library("TxDb.Hsapiens.UCSC.hg19.knownGene")
> txdb<-TxDb.Hsapiens.UCSC.hg19.knownGene
> tx_Exons<-exonsBy(txdb)
>
> 1- How do I use reduce on a GRangesList? How do I get only the unique exons
> and exclude redundant exons?
>
I'm not sure what this means -- you've asked for exons grouped by transcript,
and there are not 'extra' exons in each transcript. Did you want exonsBy(txdb,
"gene") ?
reduce(tx_Exons) reduces within each transcript (list element); I'm not sure
what you'd really like to do?
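
To make the within-element behaviour of reduce() concrete, here is a minimal sketch on a hand-built GRangesList. The object `toy` and its coordinates are made up for illustration and only stand in for exonsBy(txdb). Flattening with unlist() before calling unique() or reduce() is one possible reading of "unique exons", not necessarily what the original poster was after.

```r
library(GenomicRanges)

# Hypothetical toy data: two transcripts that share their first exon.
gr1 <- GRanges("chr10", IRanges(c(100, 300), c(200, 400)), strand = "+")
gr2 <- GRanges("chr10", IRanges(c(100, 350), c(200, 500)), strand = "+")
toy <- GRangesList(tx1 = gr1, tx2 = gr2)

# reduce() works within each list element, so nothing is merged
# across transcripts.
within_tx <- reduce(toy)

# To work across transcripts, flatten to a single GRanges first.
all_exons <- unlist(toy, use.names = FALSE)
distinct_exons <- unique(all_exons)   # drop exact duplicate ranges
merged_exons <- reduce(all_exons)     # merge overlapping ranges
```

Here unique() keeps overlapping but non-identical exons as separate ranges, whereas reduce() collapses them into one range, so which of the two is appropriate depends on what "redundant" is meant to cover.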
>
> 2- How do I select exons of certain chromosomes only, e.g. chr10? I tried the
> following, but I wonder why I got a GRangesList with empty GRanges elements?
If you want to select transcripts where all exons are on certain chromosomes,
note that

    seqnames(tx_Exons) %in% "chr10"

returns an RleList, and

    all(seqnames(tx_Exons) %in% "chr10")

asks element-wise whether all elements of each Rle are TRUE, returning a logical
vector of the same length as tx_Exons. So

    tx_Exons[all(seqnames(tx_Exons) %in% "chr10")]

returns the transcripts with all exons on chr10.

For exons grouped by _gene_, it's possible that genes are annotated to contain
exons from different chromosomes:

    > exByGn = exonsBy(txdb, "gene")
    > table(elementLengths(runLength(seqnames(exByGn))))

        1     2     3     4     5     6     7     8
    23182    77     4     3    19    38    76    60

The exons on chr10 only, preserving the grouping by gene, are

    > chr10 <- exByGn[seqnames(exByGn) %in% "chr10"]

This is what you did below. The result is not empty; it just contains many
elements left with 0 ranges because none of their exons are on chr10, plus,
deeper in the list, the elements that do have exons on chr10. Here I remove
the elements with 0 ranges:

    > chr10[elementLengths(chr10) != 0]
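
The two kinds of subsetting described above can be compared on a small hand-built example. This is only a sketch: `toy` and its gene names and coordinates are hypothetical, and lengths() is used where the session above uses elementLengths(), since later Bioconductor releases replaced elementLengths() with elementNROWS() and lengths() also works on list-like objects there.

```r
library(GenomicRanges)

# Hypothetical toy data: geneA is entirely on chr10, geneB spans chr10 and
# chr2, geneC is entirely on chr2.
gr <- GRanges(
    seqnames = c("chr10", "chr10", "chr10", "chr2", "chr2"),
    ranges = IRanges(start = c(1, 50, 1, 50, 5), width = 10)
)
toy <- split(gr, c("geneA", "geneA", "geneB", "geneB", "geneC"))

# One logical run-length vector per list element.
on10 <- seqnames(toy) %in% "chr10"

# all() collapses each element to a single TRUE/FALSE, so subsetting
# with it keeps or drops whole list elements.
keep <- all(on10)
all_on_10 <- toy[keep]

# Subsetting by the list of logicals instead filters inside each element
# and keeps every element, including those left with zero ranges.
chr10 <- toy[on10]

# Drop the elements that ended up empty.
chr10_nonempty <- chr10[lengths(chr10) != 0]
```

In this sketch `all_on_10` retains only geneA, while `chr10_nonempty` retains geneA and the chr10 part of geneB; `chr10` itself has the same length as `toy`, which is the behaviour the original poster ran into.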
Martin
>
> chr10<-tx_Exons[seqnames(tx_Exons)=="chr10",]
>
>> chr10
> GRangesList of length 80922:
> $1
> GRanges with 0 ranges and 3 metadata columns:
> seqnames ranges strand | exon_id exon_name exon_rank
> <Rle> <IRanges> <Rle> | <integer> <character> <integer>
>
> $2
> GRanges with 0 ranges and 3 metadata columns:
> seqnames ranges strand | exon_id exon_name exon_rank
>
> $3
> GRanges with 0 ranges and 3 metadata columns:
> seqnames ranges strand | exon_id exon_name exon_rank
>
> ...
> <80919 more elements>
> ---
> seqlengths:
> chr1 chr2 ... chrUn_gl000249
> 249250621 243199373 ... 38502
>
>> length(chr10)
> [1] 80922
>> length(tx_Exons)
> [1] 80922
>
> Thank you
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793