[BioC] a program to force fastq using pred+33 scoring system

Martin Morgan mtmorgan at fhcrc.org
Mon May 19 19:23:25 CEST 2014


On 05/19/2014 06:49 AM, Wang Peter wrote:
> i am sorry, i have another question about the coding
>    phred64 = rawToChar(as.raw(64:104))
>    phred33 = rawToChar(as.raw(33:73))
> qual = chartr(phred64, phred33, quality(quality(reads)))
> how did it deal with quality score from 59 to 63?

I did not include them; expand the phred64 and phred33 strings to reflect the 
desired mapping.

Martin

>
>
> On Mon, May 19, 2014 at 9:32 PM, Wang Peter <wng.peter at gmail.com
> <mailto:wng.peter at gmail.com>> wrote:
>
>     thank you very much
>     you coding is so beautiful, mine is ugly
>
>
>     On Mon, May 19, 2014 at 8:03 PM, Martin Morgan <mtmorgan at fhcrc.org
>     <mailto:mtmorgan at fhcrc.org>> wrote:
>
>         On 05/14/2014 02:56 PM, Wang Peter wrote:
>
>             i worte a program to automatically dectect the scoring system in
>             fastq file
>             and if pred+64, it should be changed into pred+33
>
>
>         I would make the relevant alphabets
>
>            phred64 = rawToChar(as.raw(64:104))
>            phred33 = rawToChar(as.raw(33:73))
>
>         translate the character representation of the underling BStringSet
>
>            qual = chartr(phred64, phred33, quality(quality(reads)))
>
>         and tag the result as phred+33
>
>            ShortReadQ(sread(reads), FastqQuality(qual))
>
>         To get a numeric representation of the quality encoding, I would
>
>            as(quality(reads), "numeric")
>
>         or for a rectangular, NA-padded representation
>
>            as(quality(reads), "matrix")
>
>         Martin
>
>
>             please help me make sure it is right
>
>             rm(list=ls())
>             fastqfile="1.fastq";
>             working_dir="d:/working_dir"
>             setwd(working_dir)
>             library(ShortRead);
>             reads <- readFastq(fastqfile);
>             seqs <- sread(reads);
>             score_sys = data.class(quality(reads));
>             cat("the quality score system
>             (SFastqQuality=Phred+64,__FastqQuality=Phred+33) is",score_sys,"\n")
>             raw_len <- max(width(reads));
>
>             qual <- quality(quality(reads));
>             myqual_16L <- charToRaw(as.character(unlist(__qual)));
>             if(score_sys =="FastqQuality")
>             {
>                myqual_10L <- strtoi(myqual_16L,16L)-33;   # this is for other use
>             }
>
>             if(score_sys =="SFastqQuality")
>             {
>                myqual_10L <- strtoi(myqual_16L,16L)-64;
>                qual_temp <- PhredQuality (as.integer(myqual_10L));
>                qual <- BStringSet(unlist(qual_temp), start= seq(from = 1, to =
>             raw_len*(length(reads)-1)+1, by = raw_len), width=raw_len);
>
>             }
>             reads <- ShortReadQ(sread=seqs, quality=qual ) );
>
>
>
>         --
>         Computational Biology / Fred Hutchinson Cancer Research Center
>         1100 Fairview Ave. N.
>         PO Box 19024 Seattle, WA 98109
>
>         Location: Arnold Building M1 B861
>         Phone: (206) 667-2793 <tel:%28206%29%20667-2793>
>
>
>
>
>     --
>     shan gao
>     Room 231(Dr.Fei lab)
>     Boyce Thompson Institute for Plant Research
>     Cornell University
>     Tower Road, Ithaca, NY 14853-1801
>     Office phone: 1-607-254-1267 <tel:1-607-254-1267>(day)
>     Official email:sg839 at cornell.edu <mailto:email%3Asg839 at cornell.edu>
>     Facebook:http://www.facebook.com/profile.php?id=100001986532253
>
>
>
>
> --
> shan gao
> Room 231(Dr.Fei lab)
> Boyce Thompson Institute for Plant Research
> Cornell University
> Tower Road, Ithaca, NY 14853-1801
> Office phone: 1-607-254-1267(day)
> Official email:sg839 at cornell.edu <mailto:email%3Asg839 at cornell.edu>
> Facebook:http://www.facebook.com/profile.php?id=100001986532253


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



More information about the Bioconductor mailing list