[BioC] Calculate Outer Positions of GappedAlignmentPairs

Hervé Pagès hpages at fhcrc.org
Wed Mar 20 19:32:36 CET 2013


Hi,

Right, forgot to provide a granges() method for GappedAlignmentPairs.
Makes sense to have one. Will do something like:

   rg <- range(grglist(x))
   if (!all(elementLengths(rg) == 1L))
     stop("For some pairs in 'x', the first and last alignments ",
          "are not aligned to the same chromosome and strand. "
          "Cannot extract a single range for them.")
   ans <- unlist(rg)
   ans

Will add this to GenomicRanges ASAP.

Note that, except for the error message (which is specific to paired-end
reads), the code above implements a kind of standard way to go from a
GRangesList object ('rg' in the code above) to a GRanges object ('ans')
by merging all the ranges within each top-level element of the
GRangesList into a single one. There is actually the same relationship
between the GRangesList and GRanges objects obtained with grglist(x)
and granges(x) on a GappedAlignments object. And we should have this
relationship again when 'x' is a GAlignmentsList. So I wonder if we
shouldn't make this transformation available as the coercion method
from GRangesList to GRanges (there is no such coercion at the moment),
and, in the man pages for grglist() and granges(), explain that
granges(x) is conceptually equivalent to as(grglist(x), "GRanges").

Cheers,
H.


On 03/20/2013 10:52 AM, Michael Lawrence wrote:
> I think that Dario was in fact aiming for a single range that spanned the
> entire alignment, including the gap between the ends, so a granges() method
> would make sense here. Looks like you're most of the way towards
> implementing it ;)
>
> Michael
>
>
>
> On Wed, Mar 20, 2013 at 10:05 AM, Valerie Obenchain <vobencha at fhcrc.org>wrote:
>
>> Hello,
>>
>> There are a couple of existing methods of GappedAlignmentPairs that may be
>> helpful.
>>
>> Using 'galp' from the man page as an example,
>>
>> example(GappedAlignmentPairs)
>>> class(galp)
>> [1] "GappedAlignmentPairs"
>> attr(,"package")
>> [1] "GenomicRanges"
>>> length(galp)
>> [1] 1572
>>
>> To get your furthest outer positions, pull out the start of the left-most
>> pair and the end of the right-most pair:
>>
>> outer <- cbind(start(left(galp)), end(right(galp)))
>>> head(outer)
>>       [,1] [,2]
>> [1,]   36  219
>> [2,]   49  259
>> [3,]   51  262
>> [4,]   60  259
>> [5,]   60  269
>> [6,]   61  282
>>
>> For coverage, the left and right pair can be extracted and coerced to a
>> GRAnges which retains introns in each pair as part of the range.
>>
>> grleft <- granges(left(galp))
>> grright <- granges(right(galp))
>>
>> If we implemented a granges,GappedAlignmentPairs it should return
>> something of the same length. To do this we'd have to merge the pair ranges
>> which would then include the space between the ranges which I don't think
>> we want. Right?
>>
>> As an fyi, another potentially useful function is introns(). This returns
>> a GRangesList of all intron ranges for each pair.
>>
>> introns <- introns(galp)
>>
>>
>> Valerie
>>
>>
>> On 03/19/2013 09:44 PM, Michael Lawrence wrote:
>>
>>> It would make sense for this to be granges,GappedAlignmentPairs. Want to
>>> contribute it?
>>>
>>> Michael
>>>
>>>
>>> On Tue, Mar 19, 2013 at 5:59 PM, Dario Strbenac <d.strbenac at garvan.org.au
>>>> **wrote:
>>>
>>>   Hello,
>>>>
>>>> I would like to calculate the furthest outer positions of the mapped read
>>>> pairs. That is, the lowest and highest mapped coordinate of a read. In a
>>>> later step, I want to get a coverage estimate that also includes coverage
>>>> of the spliced out intron parts. I couldn't find any function already in
>>>> GenomicRanges. I made my own code to create a GRanges object of these
>>>> simple ranges, but was curious about any API functionality I overlooked
>>>> that could do this.
>>>>
>>>> Thanks,
>>>>              Dario.
>>>>
>>>> ------------------------------**--------
>>>> Dario Strbenac
>>>> PhD Student
>>>> University of Sydney
>>>> Camperdown NSW 2050
>>>> Australia
>>>> ______________________________**_________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https://stat.ethz.ch/mailman/listinfo/bioconductor>
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.**science.biology.informatics.**conductor<http://news.gmane.org/gmane.science.biology.informatics.conductor>
>>>>
>>>>
>>>          [[alternative HTML version deleted]]
>>>
>>>
>>> ______________________________**_________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https://stat.ethz.ch/mailman/listinfo/bioconductor>
>>> Search the archives: http://news.gmane.org/gmane.**
>>> science.biology.informatics.**conductor<http://news.gmane.org/gmane.science.biology.informatics.conductor>
>>>
>>>
>
> 	[[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



More information about the Bioconductor mailing list