[BioC] replace nucleotide at fixed position in a DNAStringSet object

Robert Castelo robert.castelo at upf.edu
Tue Sep 17 16:54:02 CEST 2013


Hervé, Valerie,

this function replaceAt() is doing exactly what i was looking for, 
thanks a lot!!!!

robert.
ps: FYI, the version currently in SVN (.18) seems to break on the 
instruction below (y <- ..), but after checking out the very latest (.19) 
the example works.
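
For the archive, here is a minimal sketch of the character round trip mentioned further down the thread, next to the replaceAt() call. The base R part also illustrates the copy-on-modify behaviour Hervé describes: the original vector is left untouched. The variable names (seqs, nt, patched, z) are made up for illustration, and the Biostrings part is simply skipped if the package is not installed.

```r
# Toy sequences from the thread, kept as a plain character vector.
seqs <- c("ATGACCACG", "ACTGGGGAA", "GCCGATGCG")
nt   <- c("G", "C", "C")   # replacement nucleotides, one per sequence

# Base R round trip: `substr<-` is vectorised over the strings and the
# replacement values, and it modifies the copy 'patched', not 'seqs'.
patched <- seqs
substr(patched, 4, 4) <- nt

print(seqs)      # still the original sequences
print(patched)   # "ATGGCCACG" "ACTCGGGAA" "GCCCATGCG"

# Same replacement with replaceAt(), run only if Biostrings is available.
if (requireNamespace("Biostrings", quietly = TRUE)) {
  x <- Biostrings::DNAStringSet(seqs)
  # Build the DNAStringSetList the same way as in the thread, one
  # single-nucleotide DNAStringSet per sequence.
  y <- do.call(Biostrings::DNAStringSetList,
               lapply(nt, Biostrings::DNAStringSet))
  z <- Biostrings::replaceAt(x, IRanges::IRanges(4, 4), y)
  # TRUE if both approaches agree on every sequence.
  print(all(as.character(z) == patched))
}
```

The round trip is fine for small inputs; for a DNAStringSet with many sequences, replaceAt() avoids converting everything to character and back.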


On 09/16/2013 08:40 PM, Hervé Pagès wrote:
> Hi guys,
>
> With Bioc-devel, you can use replaceAt() for this:
>
>   x <- DNAStringSet(c("ATGACCACG", "ACTGGGGAA", "GCCGATGCG"))
>   y <- DNAStringSetList(DNAStringSet("G"), DNAStringSet("C"),
>                         DNAStringSet("C"))
>
> Then:
>
>   > replaceAt(x, IRanges(4, 4), y)
>     A DNAStringSet instance of length 3
>       width seq
>   [1]     9 ATGGCCACG
>   [2]     9 ACTCGGGAA
>   [3]     9 GCCCATGCG
>
> An important clarification: an XString or XStringSet object is no more
> immutable than a character vector or any other R object, in the
> sense that we are not supposed to modify it *in-place*, except in some
> particular situations where we know it's safe to do so. When it's not
> safe, the object (or part of it) is copied and the copy
> is modified. Of course all this is transparent to the end user, who
> should never need to worry about whether it is safe to call [<-,
> [[<- or replaceAt() on his/her DNAStringSet object: copies are made
> if needed, so those operations are always safe.
>
> Cheers,
> H.
>
>
>
> On 09/16/2013 09:46 AM, Valerie Obenchain wrote:
>> Hi,
>>
>> On 09/13/2013 07:13 AM, Robert Castelo wrote:
>>> hi!!
>>>
>>> i'd like to know if there is some efficient way to replace a nucleotide
>>> at a fixed position in a DNAStringSet object.
>>>
>>> let's say we have the following toy DNAStringSet object with 3 DNA
>>> sequences:
>>>
>>> x <- DNAStringSet(c("ATGACCACG", "ACTGGGGAA", "GCCGATGCG"))
>>> x
>>>   A DNAStringSet instance of length 3
>>>     width seq
>>> [1]     9 ATGACCACG
>>> [2]     9 ACTGGGGAA
>>> [3]     9 GCCGATGCG
>>>
>>> and a DNAStringSetList object with the following 3 nucleotides
>>>
>>> y <- DNAStringSetList(DNAStringSet("G"), DNAStringSet("C"),
>>>                       DNAStringSet("C"))
>>> y
>>> DNAStringSetList of length 3
>>> [[1]] G
>>> [[2]] C
>>> [[3]] C
>>>
>>> i'd like to replace the, let's say, fourth nucleotide along the DNA
>>> sequences in 'x' by those in 'y'. i can imagine how to do it coercing
>>> back and forth to character and so on but i guess there must be some
>>> more efficient way to do it.
>>
>> I don't think so. XString objects are immutable. The data are accessed
>> through an external pointer to an environment where they are
>> written/stored as raw. To subset/replace positions in 'x' with values
>> from 'y' you would need to go through the 'as.character' conversion and
>> create a new DNAStringSet.
>>
>> I've cc'd Hervé in case I've gotten this wrong or he has a different
>> solution to the problem.
>>
>> Valerie
>>
>>> my interest comes from the fact that the
>>> DNAStringSet object i have to work with can have many DNA sequences.
>>>
>>> thanks!!
>>> robert.
>>>
>>
>

-- 
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550


