[BioC] Count differences between sequences

Patrick Aboyoun paboyoun at fhcrc.org
Thu Mar 25 22:45:29 CET 2010


Erik,
Could you provide more details on your data? How long is each string, and how many strings do you have? Also, do you need the entire N x N distance matrix for downstream analysis, or are you just looking for the closest relatives?
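
For a single pair of already aligned, equal-width sequences, the number of differences is just the count of mismatching positions (the Hamming distance), so no alignment step is needed. As a rough sketch in base R only, assuming plain character strings (a DNAString or DNAStringSet would first need as.character()), and with hypothetical helper names count_diffs and diff_matrix that are not part of Biostrings:

```r
# Hypothetical helper: count mismatching positions between two aligned,
# equal-width sequences given as single character strings.
count_diffs <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(utf8ToInt(a) != utf8ToInt(b))
}

# Hypothetical helper: full N x N matrix of mismatch counts for a
# character vector of aligned, equal-width sequences.
diff_matrix <- function(seqs) {
  m <- do.call(rbind, strsplit(seqs, ""))  # one row per sequence, one column per position
  n <- nrow(m)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    # t(m) has one column per sequence; m[i, ] is recycled down each column
    d[i, ] <- colSums(t(m) != m[i, ])
  }
  d
}

count_diffs("ACAC", "ACAG")            # 1
diff_matrix(c("ACAC", "ACAG", "TCAG"))
```

The matrix version still loops over all N sequences and holds an N x width character matrix in memory, so whether it is practical depends on the sizes asked about above.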


Patrick



On 3/25/10 2:29 PM, erikwright at comcast.net wrote:
> Hello all,
>
>
> I have a large DNAStringSet and I am trying to calculate its distance matrix. My DNAStrings are equal width and they are already aligned.
>
>
> I have tried using the stringDist() function, but it is very slow for large DNAStringSets. Is there a way to quickly calculate the number of differences between two DNAString instances?
>
>
> For example, let's say I have two DNAStrings: "ACAC" and "ACAG". I would like to know if there is a function other than stringDist() that will tell me that the distance between them is 1.
>
>
> Thanks in advance for any help.
>
>
> - Erik


