[BioC] oligonucleotideTransitions from Biostrings package

Hervé Pagès hpages at fhcrc.org
Tue Apr 3 19:10:24 CEST 2012


Hi Jose,

Good catch. Actually, I think the problem is that
oligonucleotideTransitions() was not meant to be used on a DNAStringSet,
only on a DNAString object. However, the man page didn't say so, and the
function would accept a DNAStringSet object without complaining and
return a wrong result.

I've fixed this in Biostrings release (2.24.1) and devel (2.25.1):

 > seq <- DNAStringSet(c("AAAAA", "TTTTT"))
 > seq2 <- DNAStringSet(c("AAAAA", "TTTTTCC"))
 > oligonucleotideTransitions(seq)
   A C G T
A 4 0 0 0
C 0 0 0 0
G 0 0 0 0
T 0 0 0 4
 > oligonucleotideTransitions(seq2)
   A C G T
A 4 0 0 0
C 0 1 0 0
G 0 0 0 0
T 0 1 0 4
 > oligonucleotideTransitions(seq2, left=2)
    A C G T
AA 3 0 0 0
AC 0 0 0 0
AG 0 0 0 0
AT 0 0 0 0
CA 0 0 0 0
CC 0 0 0 0
CG 0 0 0 0
CT 0 0 0 0
GA 0 0 0 0
GC 0 0 0 0
GG 0 0 0 0
GT 0 0 0 0
TA 0 0 0 0
TC 0 1 0 0
TG 0 0 0 0
TT 0 1 0 3
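
To make the counts above easier to check by hand, here is a minimal base R sketch that tallies transitions over a plain character vector. It is not the Biostrings implementation; count_transitions() and its 'alphabet' argument are hypothetical names used only for illustration, and the sketch simply sums, over all sequences, how often each 'left'-mer is immediately followed by each 'right'-mer.

```r
# Illustrative sketch only (NOT the Biostrings code): count how often each
# 'left'-mer is immediately followed by each 'right'-mer, summed over all
# sequences in a character vector.
count_transitions <- function(seqs, left = 1L, right = 1L,
                              alphabet = c("A", "C", "G", "T")) {
  if (left < 1L || right < 1L)
    stop("'left' and 'right' must be >= 1")

  # Enumerate all k-mers over the alphabet, in sorted order.
  kmers <- function(k) {
    grid <- expand.grid(rep(list(alphabet), k), stringsAsFactors = FALSE)
    sort(do.call(paste0, unname(as.list(grid))))
  }

  rows <- kmers(left)
  cols <- kmers(right)
  counts <- matrix(0L, nrow = length(rows), ncol = length(cols),
                   dimnames = list(rows, cols))

  width <- left + right
  for (s in seqs) {
    n <- nchar(s)
    if (n < width) next  # sequence too short to contain any window
    for (i in seq_len(n - width + 1L)) {
      l <- substr(s, i, i + left - 1L)
      r <- substr(s, i + left, i + width - 1L)
      # Windows containing letters outside the alphabet are skipped.
      if (l %in% rows && r %in% cols)
        counts[l, r] <- counts[l, r] + 1L
    }
  }
  counts
}

print(count_transitions(c("AAAAA", "TTTTTCC")))
print(count_transitions(c("AAAAA", "TTTTTCC"), left = 2L))
```

For the two sequences in seq2, "AAAAA" contributes four A->A transitions, and "TTTTTCC" contributes four T->T, one T->C, and one C->C, which matches the matrix shown above.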

Also note that now 'left' and 'right' are checked and must be >= 1.

 > oligonucleotideTransitions(seq2, left=0)
Error in oligonucleotideTransitions(seq2, left = 0) : 'left' must be >= 1

The updated versions of Biostrings should become available through
biocLite() in the next 24 hours or so.
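
If you want to check whether an installed copy already includes the fix, something along these lines may help. It is only a sketch: has_transitions_fix() is a hypothetical helper, and it assumes that every version after the two mentioned above keeps the fix.

```r
# Hypothetical helper: TRUE if 'version' is at least 2.24.1 on the 2.24
# release branch, or at least 2.25.1 otherwise.
# Assumption: later versions retain the fix.
has_transitions_fix <- function(version) {
  v <- package_version(version)
  (v >= "2.24.1" && v < "2.25.0") || v >= "2.25.1"
}

# Check the installed package, if there is one.
if (requireNamespace("Biostrings", quietly = TRUE)) {
  print(has_transitions_fix(as.character(packageVersion("Biostrings"))))
}
```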

Thanks for your feedback!
H.


On 03/31/2012 08:04 AM, Muino, Jose wrote:
> Dear all,
>
> I have the impression that the function oligonucleotideTransitions from the Biostrings package (version 2.9) has unexpected behavior: it only analyzes the first sequence of the DNAStringSet used as input.
>
> Looking at the code of this function, I have the impression that this is simply because it calls oligonucleotideFrequency without setting the parameter simplify.as to "collapse". This parameter was probably added to oligonucleotideFrequency after oligonucleotideTransitions was implemented.
>
> This is an example of what I call an unexpected behavior (I just set the parameter left=0 to simplify the result):
>> seq<-DNAStringSet(c("AAAAA","TTTTT"))
>> oligonucleotideTransitions(seq,left=0)
>       [,1] [,2] [,3] [,4]
> [1,]    5    0    0    0
>
> when the result should be
>       [,1] [,2] [,3] [,4]
> [1,]    5    0    0    5
>
>
> Should I forward this message to the maintainer of the Biostrings package? What is the maintainer's email address?
>
> Thanks,
> Jose
> Dr. Jose M Muino
> Plant Research International B.V.
> Droevendaalsesteeg 1
> P.O. Box 16, 6700 AA Wageningen, The Netherlands
> Phone: +0317-481122.
> E-mail: jose.muino at wur.nl
> http://www.pri.wur.nl
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


