[BioC] Calculate Outer Positions of GappedAlignmentPairs

Hervé Pagès hpages at fhcrc.org
Wed Mar 20 20:00:39 CET 2013


On 03/20/2013 11:32 AM, Hervé Pagès wrote:
> Hi,
>
> Right, forgot to provide a granges() method for GappedAlignmentPairs.
> Makes sense to have one. Will do something like:
>
>    rg <- range(grglist(x))
>    if (!all(elementLengths(rg) == 1L))
>      stop("For some pairs in 'x', the first and last alignments ",
>           "are not aligned to the same chromosome and strand. "
>           "Cannot extract a single range for them.")
>    ans <- unlist(rg)
>    ans
>
> Will add this to GenomicRanges ASAP.

Added to GenomicRanges 1.11.39 (devel).

Cheers,
H.

>
> Note that, except for the error message (which is specific to paired-end
> reads), the code above implements a kind of standard way to go from a
> GRangesList object ('rg' in the code above) to a GRanges object ('ans')
> by merging all the ranges within each top-level element of the
> GRangesList into a single one. There is actually the same relationship
> between the GRangesList and GRanges objects obtained with grglist(x)
> and granges(x) on a GappedAlignments object. And we should have this
> relationship again when 'x' is a GAlignmentsList. So I wonder if we
> shouldn't make this transformation available as the coercion method
> from GRangesList to GRanges (there is no such coercion at the moment),
> and, in the man pages for grglist() and granges(), explain that
> granges(x) is conceptually equivalent to as(grglist(x), "GRanges").
>
> Cheers,
> H.
>
>
> On 03/20/2013 10:52 AM, Michael Lawrence wrote:
>> I think that Dario was in fact aiming for a single range that spanned the
>> entire alignment, including the gap between the ends, so a granges()
>> method
>> would make sense here. Looks like you're most of the way towards
>> implementing it ;)
>>
>> Michael
>>
>>
>>
>> On Wed, Mar 20, 2013 at 10:05 AM, Valerie Obenchain
>> <vobencha at fhcrc.org>wrote:
>>
>>> Hello,
>>>
>>> There are a couple of existing methods of GappedAlignmentPairs that
>>> may be
>>> helpful.
>>>
>>> Using 'galp' from the man page as an example,
>>>
>>> example(GappedAlignmentPairs)
>>>> class(galp)
>>> [1] "GappedAlignmentPairs"
>>> attr(,"package")
>>> [1] "GenomicRanges"
>>>> length(galp)
>>> [1] 1572
>>>
>>> To get your furthest outer positions, pull out the start of the
>>> left-most
>>> pair and the end of the right-most pair:
>>>
>>> outer <- cbind(start(left(galp)), end(right(galp)))
>>>> head(outer)
>>>       [,1] [,2]
>>> [1,]   36  219
>>> [2,]   49  259
>>> [3,]   51  262
>>> [4,]   60  259
>>> [5,]   60  269
>>> [6,]   61  282
>>>
>>> For coverage, the left and right pair can be extracted and coerced to a
>>> GRAnges which retains introns in each pair as part of the range.
>>>
>>> grleft <- granges(left(galp))
>>> grright <- granges(right(galp))
>>>
>>> If we implemented a granges,GappedAlignmentPairs it should return
>>> something of the same length. To do this we'd have to merge the pair
>>> ranges
>>> which would then include the space between the ranges which I don't
>>> think
>>> we want. Right?
>>>
>>> As an fyi, another potentially useful function is introns(). This
>>> returns
>>> a GRangesList of all intron ranges for each pair.
>>>
>>> introns <- introns(galp)
>>>
>>>
>>> Valerie
>>>
>>>
>>> On 03/19/2013 09:44 PM, Michael Lawrence wrote:
>>>
>>>> It would make sense for this to be granges,GappedAlignmentPairs.
>>>> Want to
>>>> contribute it?
>>>>
>>>> Michael
>>>>
>>>>
>>>> On Tue, Mar 19, 2013 at 5:59 PM, Dario Strbenac
>>>> <d.strbenac at garvan.org.au
>>>>> **wrote:
>>>>
>>>>   Hello,
>>>>>
>>>>> I would like to calculate the furthest outer positions of the
>>>>> mapped read
>>>>> pairs. That is, the lowest and highest mapped coordinate of a read.
>>>>> In a
>>>>> later step, I want to get a coverage estimate that also includes
>>>>> coverage
>>>>> of the spliced out intron parts. I couldn't find any function
>>>>> already in
>>>>> GenomicRanges. I made my own code to create a GRanges object of these
>>>>> simple ranges, but was curious about any API functionality I
>>>>> overlooked
>>>>> that could do this.
>>>>>
>>>>> Thanks,
>>>>>              Dario.
>>>>>
>>>>> ------------------------------**--------
>>>>> Dario Strbenac
>>>>> PhD Student
>>>>> University of Sydney
>>>>> Camperdown NSW 2050
>>>>> Australia
>>>>> ______________________________**_________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https://stat.ethz.ch/mailman/listinfo/bioconductor>
>>>>>
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.**science.biology.informatics.**conductor<http://news.gmane.org/gmane.science.biology.informatics.conductor>
>>>>>
>>>>>
>>>>>
>>>>          [[alternative HTML version deleted]]
>>>>
>>>>
>>>> ______________________________**_________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https://stat.ethz.ch/mailman/listinfo/bioconductor>
>>>>
>>>> Search the archives: http://news.gmane.org/gmane.**
>>>> science.biology.informatics.**conductor<http://news.gmane.org/gmane.science.biology.informatics.conductor>
>>>>
>>>>
>>>>
>>
>>     [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



More information about the Bioconductor mailing list