[BioC] consensusString Function

Wolfgang Huber whuber at embl.de
Tue Apr 6 23:36:50 CEST 2010


Hi Erik, Herv'e

please always provide the output of sessionInfo(), and a complete 
reproducible example (you let Heidi and the others guess that you're 
talking about the Biostrings package).

With a more recent version of Biostrings, I get:

library("Biostrings")
consensusString( DNAStringSet(c("AAAA","ACTG")) )
# [1] "AMWR"

consensusString( DNAStringSet(c("AAAB","ACTG")) )
# Error in FUN(newX[, i], ...) :
   'ambiguityMap' is missing some combinations of row names

And going into the debugger where the error is caused, i.e. within the 
function "consensusLetter", the expression

i <- paste(all_letters[col >= threshold], collapse = "")

is evaluated where
Browse[1]> all_letters
[1] "A" "C" "G" "T" "B"    ## length is 5
Browse[1]> col
[1] 1 0 0 0                ## length is 4
Browse[1]> i
[1] "AB"                   ## recycling rule was applied

which seems unintended and with some more insight will probably lead to 
the fixing of a bug.

	Best wishes
	Wolfgang


 > sessionInfo()
R version 2.12.0 Under development (unstable) (2010-04-06 r51617)
x86_64-unknown-linux-gnu

locale:
  [1] LC_CTYPE=C              LC_NUMERIC=C            LC_TIME=C 

  [4] LC_COLLATE=C            LC_MONETARY=C 
LC_MESSAGES=it_IT.UTF-8
  [7] LC_PAPER=C              LC_NAME=C               LC_ADDRESS=C 

[10] LC_TELEPHONE=C          LC_MEASUREMENT=C        LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] Biostrings_2.15.26 IRanges_1.5.74     fortunes_1.3-7

loaded via a namespace (and not attached):
[1] Biobase_2.7.5

Heidi Dvinge ha scritto:
> Hello Erik,
> 
> unfortunately I'm not familiar with the Biostrings package, so I can't
> tell you why this doesn't work, but until someone else can answer, this
> seems to be a work-around.
> 
> Apparently, consensusString doesn't handle Ns.
> 
>> test <- DNAStringSet(c("AANN","ACTG"))
>> consensusString(test)
> Error in .local(x, ...) :
>   'threshold' must be a numeric in (0, 1/sum(rowSums(x) > 0)]
> 
> If there are no Ns things are okay though.
> 
>> test2 <- DNAStringSet(c("AAAA","ACTG"))
>> consensusString(test2)
> [1] "AMWR"
> 
> However, Ns seem acceptable if the consensus matrix is calculated first,
> although they might result in ?s where no consensus could be found.
> 
>> test3	<- consensusMatrix(test)
>> consensusString(test3)
> [1] "A???"
> 
> HTH
> \Heidi
> 
> 
>> Hello,
>>
>> I am trying to get a consensus string for a DNAStringSet, but I am getting
>> an error.  The documentation for consensusString says the argument "x" is
>> either a consensus matrix or an XStringSet.  So this should work, right?:
>>> myDNAStringSet <- DNAStringSet(c("NNNN","ACTG"))
>>> consensusString(myDNAStringSet)
>> Error in .local(x, ...) :
>>   'threshold' must be a numeric in (0, 1/sum(rowSums(x) > 0)]
>>
>> Specifying a threshold in the arguments doesn't seem to make a difference.
>>
>> Thanks!,
>> Erik
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 

Best wishes
      Wolfgang


--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber/contact



More information about the Bioconductor mailing list