[BioC] [Bioc-sig-seq] extract id from ShortRead

Sean Davis seandavi at gmail.com
Mon Nov 30 16:29:03 CET 2009


On Mon, Nov 30, 2009 at 10:20 AM, Ramzi TEMANNI <ramzi.temanni at gmail.com> wrote:
> Hi Sean,
> Thanks for your help. as.character gives 2 ids per row:
> [1] "HWI-EA332_8_1_3_659#GGGGNN/1"  "HWI-EA332_8_1_3_1738#CCCCNN/1"
> [3] "HWI-EA332_8_1_3_1094#AGGANN/1" "HWI-EA332_8_1_3_558#TTTCNN/1"
> [5] "HWI-EA332_8_1_3_1920#AAAANN/1" "HWI-EA332_8_1_3_228#GGGGNN/1"
> There should be some accessor to extract one id per row.
> I've taken a look at the suggested help, but there's no useful info on how to
> extract what I want.

This is NOT two ids per row.  It is one vector.  R outputs two
elements per row only because your screen is wide enough for that.
The bracketed numbers ([1], [3], [5]) are the index of the first
element printed on each line.

If you do:

tmp <- as.character(id(aln))

class(tmp)

tmp[1]
tmp[2]
tmp[1:5]
length(tmp)

it might give you more of an idea what is going on.
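
As a rough illustration, here is a minimal sketch in base R only. The
vector tmp below is built by hand from four of the ids shown above and
merely stands in for as.character(id(aln)); the names old and tab are
made up for this example.

```r
# Hypothetical stand-in for as.character(id(aln)): four ids typed in by hand.
tmp <- c("HWI-EA332_8_1_3_659#GGGGNN/1",
         "HWI-EA332_8_1_3_1738#CCCCNN/1",
         "HWI-EA332_8_1_3_1094#AGGANN/1",
         "HWI-EA332_8_1_3_558#TTTCNN/1")

class(tmp)    # a plain character vector
length(tmp)   # one element per id, however the vector happens to print

# The print layout depends on the console width, not on the data.
# With a narrow width, each id no longer fits next to another one.
old <- options(width = 40)
print(tmp)
options(old)  # restore the previous width

# Two ways to get exactly one id per row:
writeLines(tmp)                                        # one id per output line
tab <- data.frame(id = tmp, stringsAsFactors = FALSE)  # one id per table row
nrow(tab)
```

The data.frame is only one possible reading of "store the result in a
table"; a plain character vector is often enough for selecting reads.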

Sean



>
> On Mon, Nov 30, 2009 at 3:40 PM, Sean Davis <seandavi at gmail.com> wrote:
>>
>> On Mon, Nov 30, 2009 at 9:27 AM, Ramzi TEMANNI <ramzi.temanni at gmail.com> wrote:
>> > Hi,
>> > I have sequences loaded from a bowtie alignment:
>> > aln <- readAligned("./S1", pattern="S1_1.hg19.bowtie.align", type="Bowtie")
>> > I would like to extract the ids to select specific reads.
>> > I ran id(aln) and got:
>> > id(aln)
>> >  A BStringSet instance of length 4340867
>> >          width seq
>> >      [1]    28 HWI-EA332_8_1_3_659#GGGGNN/1
>> >      [2]    29 HWI-EA332_8_1_3_1738#CCCCNN/1
>> >      [3]    29 HWI-EA332_8_1_3_1094#AGGANN/1
>> >      [4]    28 HWI-EA332_8_1_3_558#TTTCNN/1
>> >      [5]    29 HWI-EA332_8_1_3_1920#AAAANN/1
>> >      [6]    28 HWI-EA332_8_1_3_228#GGGGNN/1
>> >      [7]    29 HWI-EA332_8_1_3_1261#AGGGNN/1
>> >      [8]    28 HWI-EA332_8_1_3_908#ACTTNN/1
>> >      [9]    27 HWI-EA332_8_1_3_53#CTGCNN/1
>> >      ...   ... ...
>> > [4340859]    33 HWI-EA332_8_120_1596_499#TTGANA/1
>> > [4340860]    34 HWI-EA332_8_120_1599_1161#CCACNT/1
>> > [4340861]    33 HWI-EA332_8_120_1601_255#CTCTNA/1
>> > [4340862]    33 HWI-EA332_8_120_1601_504#CCATNC/1
>> > [4340863]    33 HWI-EA332_8_120_1624_899#CTCTNT/1
>> > [4340864]    33 HWI-EA332_8_120_1487_658#ACCCNA/1
>> > [4340865]    32 HWI-EA332_8_120_1533_28#CACANG/1
>> > [4340866]    33 HWI-EA332_8_120_1564_807#CCCGNG/1
>> > [4340867]    34 HWI-EA332_8_120_1474_1350#CCTGNC/1
>> >
>> > This BStringSet instance has 'width' and 'seq'.
>> > Running str(id(aln)), I got this:
>> >
>> > Formal class 'BStringSet' [package "Biostrings"] with 5 slots
>> >  ..@ pool           :Formal class 'SharedRaw_Pool' [package "IRanges"] with 2 slots
>> >  .. .. ..@ xp_list                    :List of 1
>> >  .. .. .. ..$ :<externalptr>
>> >  .. .. ..@ .link_to_cached_object_list:List of 1
>> >  .. .. .. ..$ :<environment: 0x2af6400>
>> >  ..@ ranges         :Formal class 'GroupedIRanges' [package "IRanges"] with 7 slots
>> >  .. .. ..@ group          : int [1:4340867] 1 1 1 1 1 1 1 1 1 1 ...
>> >  .. .. ..@ start          : int [1:4340867] 1 29 58 87 115 144 172 201 229 256 ...
>> >  .. .. ..@ width          : int [1:4340867] 28 29 29 28 29 28 29 28 27 29 ...
>> >  .. .. ..@ NAMES          : NULL
>> >  .. .. ..@ elementMetadata: NULL
>> >  .. .. ..@ elementType    : chr "integer"
>> >  .. .. ..@ metadata       : list()
>> >  ..@ elementMetadata: NULL
>> >  ..@ elementType    : chr "BString"
>> >  ..@ metadata       : list()
>> >
>> > But I'm wondering how to extract only the 'seq' from all that and store the
>> > result in a table?
>>
>> as.character(id(aln))
>>
>> will return a character vector of the names.  You might want to look
>> at the help for AlignedRead-class and BStringSet-class for some help
>> in understanding these classes and what can be done with them.  It may
>> be that you will not need to go to character vector to do what you
>> want with the reads.
>>
>> Sean
>
>


