[BioC] DNAString: Error for long sequences

Martin Morgan mtmorgan at fhcrc.org
Wed Aug 27 05:21:08 CEST 2008


Hi Ann --

Ann Hess <hess at stat.colostate.edu> writes:

> I am having trouble creating long (~1000+ characters)
> DNAStrings. Specifically, I seem to get an error when R has to go to a
> new line in the command window when copy/pasting the sequence from an
> R script (with no breaks).  I recently upgraded R and BioConductor,
> but previously I did not have this problem.

The problem is the line feed that ends up embedded in the string when
the pasted sequence wraps onto a new line:

> DNAString("AAA
+ TTT")
Error in charToXRaw(x, start = start, end = end, width = width, lkup = lkup,  : 
  key 10 not in lookup table

as I guess you've figured out. I *think* (no Windows access at the
moment) that you can use

> dna <- readLines(file("clipboard"))

to read the clipboard without pasting into the console.  You could
also strip the line feed after the fact:

> dna <- "AAA
+ TTT"
> dna
[1] "AAA\nTTT"
> DNAString(gsub("\n", "", dna))
  6-letter "DNAString" instance
seq: AAATTT
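
A slightly more general sketch of the same idea, in case the sequence
arrives as several lines (as readLines() returns it) or with
Windows-style "\r\n" line endings. It uses base R only; clean_seq is a
hypothetical helper name, not a Biostrings function, and its result
would then be passed to DNAString(). The last line shows that the line
feed has character code 10, which is presumably the "key 10" in the
error message.

```r
# Hypothetical helper (not part of Biostrings): collapse a character
# vector into one string and drop all whitespace, including "\n" and "\r".
clean_seq <- function(x) {
  gsub("[[:space:]]+", "", paste(x, collapse = ""))
}

dna <- "AAA\nTTT"
clean_seq(dna)                # "AAATTT"
clean_seq(c("AAA", "TTT\r"))  # "AAATTT"

# The line feed character has code 10.
utf8ToInt("\n")               # 10
```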

Martin

> Code and error is below.  Any suggestions?
>
> Ann
>
> **********************************************************
>
>> library(Biostrings)
>
> # "Short" example with no error
>> Temp1<-DNAString(
> + "AACACACGCATCTCACGCCGAGGACCTGGGATCG.....")
>> Temp1
>    969-letter "DNAString" instance
>
> # "Long" example, such that R needs a new line to complete the sequence
>> Temp2<-DNAString(
> + "AACACACGCATCTCACGCCGAGGACCTGGGATCG.....
> + TTTTTCGCATAGTGCTCAA")
> Error in charToXRaw(x, start = start, end = end, width = width, lkup =
> lkup,  :
>    key 10 not in lookup table
>
>> sessionInfo()
> R version 2.7.1 (2008-06-23)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] Biostrings_2.8.17

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


