[BioC] Question about Rle

Valerie Obenchain vobencha at fhcrc.org
Thu Jul 31 20:41:15 CEST 2014


Hi,

The standard Bioconductor containers for sequences are the XString
family. BString, DNAString, RNAString and AAString types are supported.
See the man pages for details on subsetting and manipulation.

library(Biostrings)
?XString
?DNAString

You can represent a single sequence
> dna <- DNAString("ATGTATCC")
> dna
  8-letter "DNAString" instance
seq: ATGTATCC
> dna[3]
  1-letter "DNAString" instance
seq: G

or a set of sequences.
> dnast <- DNAStringSet(c("AT", "G", "ATC", "C"))
> dnast
  A DNAStringSet instance of length 4
    width seq
[1]     2 AT
[2]     1 G
[3]     3 ATC
[4]     1 C
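
For the position-wise comparison asked about in the original question, a
minimal sketch with the XString containers might look like the following.
The reference sequence is made up for illustration and does not come from
a real genome; the only Biostrings calls assumed are DNAString(), subseq()
and as.character().

```r
library(Biostrings)

# Sequence from the question, plus a hypothetical reference (illustration only)
query <- DNAString("ATGTATCC")
reference <- DNAString("ATGTTTCC")

# Base at position 3, returned as an ordinary character string
as.character(subseq(query, start = 3, end = 3))

# Split both sequences into single letters and list the positions that differ
q <- strsplit(as.character(query), "")[[1]]
r <- strsplit(as.character(reference), "")[[1]]
which(q != r)
```

Splitting into single letters is fine for short sequences like this one;
for genome-scale comparisons the matching and alignment tools documented in
Biostrings are likely a better starting point.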

Methods for operating on sequences are built on the XString framework,
not on Rles. If you really want to represent a sequence as an Rle, it
would be a character Rle, constructed the same way as an Rle of any
other atomic type. See ?Rle.

> rle <- Rle(c("A", "T", "G", "T", "A", "T", "C"), c(rep(1, 6), 2))
> rle
character-Rle of length 8 with 7 runs
  Lengths:   1   1   1   1   1   1   2
  Values : "A" "T" "G" "T" "A" "T" "C"
> rle[3]
character-Rle of length 1 with 1 run
  Lengths:   1
  Values : "G"
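
Rle() can also work out the runs itself when it is given only the expanded
values, so the lengths need not be typed by hand. A small sketch, assuming
the S4Vectors package (where Rle is defined in current Bioconductor;
library(IRanges) loads it as well):

```r
library(S4Vectors)

# Split the sequence from the question into single letters
bases <- strsplit("ATGTATCC", "")[[1]]

# Given values only, Rle() computes the run lengths itself
x <- Rle(bases)

runValue(x)    # one value per run
runLength(x)   # how many times each value repeats

# Expand back to an ordinary character vector and look up position 3
as.vector(x)[3]
```

With 8 letters stored as 7 runs there is almost no compression here, which
is one reason the XString containers are the better fit for sequence data.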


Valerie


On 07/29/2014 04:04 AM, Asma rabe wrote:
> Hi All,
>
>
> I have a question about DNA sequence encoding in Rle objects.
>
>
> #-------------------------
>
> From the IRanges vignette:
>
>   the sequence {1, 1, 1, 2, 3, 3} can be represented as values
>   = {1, 2, 3}, run lengths = {3, 1, 2}.
>
>
> #---------------------------
>
> Suppose we have a sequence from chr1, positions 1-8: ATGTATCC. How can
> it be represented as an Rle object so that, for instance, the nucleotide
> at position 3 is recognized as G for comparison with a reference genome?
>
>
> Thank you very much in advance.
>


-- 
Valerie Obenchain
Program in Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, Seattle, WA 98109

Email: vobencha at fhcrc.org
Phone: (206) 667-3158


