[BioC] wishlist for readGappedAlignments

Hervé Pagès hpages at fhcrc.org
Wed Aug 10 19:44:01 CEST 2011


Thanks Tengfei and Cory for the feedback. Those requests make
a lot of sense and we agree that the GappedAlignments container
will benefit a lot from these kinds of improvements. We'll do our
best to implement most of them in one way or another before the
next release.

Cheers,
H.


On 11-08-09 01:50 PM, Martin Morgan wrote:
> On 08/09/2011 11:58 AM, Cory Barr wrote:
>> I would find being able to pass the "what" component of a ScanBamParam
>> object to readBamGappedAlignments very helpful. Like Tengfei, I often
>> read in a BAM file with readBamGappedAlignments and also scanBam, then
>> combine the information.
>
> As a start, GappedAlignments() in 1.5.23 has a new ... argument used to
> populate elementMetadata. So e.g.,
>
> bam <- scanBam(<...>)
> with(bam[[1]], GappedAlignments(<...>, qual=qual))
>
> Martin
>
>> Being able to maintain information on a read's mate via
>> readBamGappedAlignments would also get much use from me. Currently, to do
>> this I combine information from scanBam, parse out the end number from
>> the BAM flag, and then regroup a GRangesList to include its mate. Doing
>> this efficiently by passing an argument to grglist would be great.
>>
>> -Cory
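
The manual workaround described above (read the flag, work out the end
number, regroup each read with its mate) can be sketched in base R without
any Bioconductor calls. This is only an illustration on made-up records:
the `reads` data frame and the helpers `has_bit` and `mate_number` are
hypothetical names, not part of Rsamtools or GenomicRanges. The bit values
(1 = paired, 64 = first in pair, 128 = second in pair) are taken from the
SAM flag definition.

```r
# Toy records standing in for the "qname", "flag" and "pos" fields that
# scanBam() can return. The values are invented for illustration.
reads <- data.frame(
  qname = c("r1", "r2", "r1", "r2", "r3"),
  flag  = c(99L, 163L, 147L, 83L, 0L),
  pos   = c(100L, 250L, 300L, 420L, 900L),
  stringsAsFactors = FALSE
)

# TRUE where the given flag bit is set.
has_bit <- function(flag, bit) bitwAnd(flag, bit) != 0L

# End number of each read: 1 or 2 for paired reads, NA otherwise.
# SAM flag bits: 1 = paired, 64 = first in pair, 128 = second in pair.
mate_number <- function(flag) {
  ifelse(!has_bit(flag, 1L), NA_integer_,
         ifelse(has_bit(flag, 64L), 1L,
                ifelse(has_bit(flag, 128L), 2L, NA_integer_)))
}

reads$mate <- mate_number(reads$flag)

# Keep paired reads, put mate 1 before mate 2, and group by read name.
paired <- reads[!is.na(reads$mate), ]
paired <- paired[order(paired$qname, paired$mate), ]
pairs  <- split(paired, paired$qname)

print(pairs)
```

Grouping by read name alone assumes one alignment per end; reads with
multiple hits would need an extra key, which this sketch does not attempt.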
>>
>> On Tue, Aug 9, 2011 at 11:33 AM, Tengfei Yin <yintengfei at gmail.com> wrote:
>>
>>> Dear all,
>>>
>>> I am using GenomicRanges and Rsamtools a lot for my work. They are
>>> extremely helpful and neat packages for dealing with NGS data; thanks a
>>> lot to the people who contribute to all those nice packages in BioC. I
>>> have some feature requests for GappedAlignments. Perhaps they are
>>> already there, or it is not good practice to do things this way; please
>>> feel free to let me know.
>>>
>>> I like features from both scanBam and readBamGappedAlignments, but
>>> sometimes I need to write my own script to combine information from
>>> those two functions and make a "general" GRanges to work with. So I am
>>> wondering whether there is a neat way to do this, or a plan to
>>> implement similar features?
>>>
>>> - Including more element metadata with GappedAlignments
>>>   - There is "which" in readBamGappedAlignments; can I have something
>>>     like "param" or "what" to get more info from the BAM file and
>>>     associate it with the gapped reads?
>>>   - When coercing from GappedAlignments to GRanges, or calling
>>>     granges() on a GappedAlignments object, only the minimal
>>>     information is returned; "qwidth", "cigar" and "ngap" are not
>>>     included as elementMetadata.
>>> - Including more pairing information for paired-end RNA-seq
>>>   - So I could know the mate information for a given gapped read, and
>>>     either plot it as a paired-end read or do some computation on it.
>>> - Setting flags for each entry, so I can filter based on the flags,
>>>   something like scanBamFlag?
>>> - grglist to transform the data in different ways
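
On the granges() point in the list above: if the CIGAR string is carried
along as metadata, a query width and a gap count can be recomputed from it.
The base R sketch below is an illustration only. `parse_cigar` and
`cigar_summary` are hypothetical helpers, and counting N operations as
"gaps" is an assumption about what "ngap" reports, not something confirmed
in this thread. The set of query-consuming operations (M, I, S, =, X)
follows the SAM specification.

```r
# Split one CIGAR string into its operation letters and lengths.
parse_cigar <- function(cigar) {
  tokens <- regmatches(cigar, gregexpr("[0-9]+[MIDNSHP=X]", cigar))[[1]]
  data.frame(
    op  = substring(tokens, nchar(tokens)),
    len = as.integer(substring(tokens, 1, nchar(tokens) - 1)),
    stringsAsFactors = FALSE
  )
}

# CIGAR operations that consume bases of the read (query).
query_ops <- c("M", "I", "S", "=", "X")

# Per-alignment summary: query width and number of N (skipped region)
# operations. Treating N operations as gaps is an assumption.
cigar_summary <- function(cigars) {
  parsed <- lapply(cigars, parse_cigar)
  data.frame(
    cigar   = cigars,
    qwidth  = vapply(parsed,
                     function(p) sum(p$len[p$op %in% query_ops]),
                     integer(1)),
    n_skips = vapply(parsed,
                     function(p) sum(p$op == "N"),
                     integer(1)),
    stringsAsFactors = FALSE
  )
}

# Made-up CIGAR strings for three 36-base reads.
res <- cigar_summary(c("36M", "10M200N26M", "5S20M2I9M"))
print(res)
```

The resulting columns could then be attached to whatever ranges object is
in use, alongside the other per-read fields.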
>>>
>>> If I could get a general data structure that combines all this
>>> information and these features, that would be nice. I realize it's hard
>>> to combine all the information and stay flexible at the same time; e.g.
>>> you need to decide how to bind element metadata for paired entries
>>> (perhaps showing seq1/seq2 to indicate which sequence it belongs to?)
>>> and how to handle multiple hits.
>>>
>>> Right now, I am making my own "giant" GRanges object that includes all
>>> the information I want, but that's too specific to my work. That's why
>>> I am wondering if there is any plan to combine those neat features and
>>> provide a more flexible data structure.
>>>
>>> Thanks!
>>>
>>> Tengfei
>>>
>>>
>>> --
>>> Tengfei Yin
>>> MCDB PhD student
>>> 1620 Howe Hall, 2274,
>>> Iowa State University
>>> Ames, IA,50011-2274
>>> Homepage: www.tengfei.name
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>
>


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


