[BioC] about the quality score

Martin Morgan mtmorgan at fhcrc.org
Thu Jan 12 14:59:30 CET 2012


On 01/11/2012 01:05 PM, wang peter wrote:
> Dear Martin,
>        the Illumina1.3+ (Phred+64) score is not the Solexa score.
>
> You can see:
>
>
> Format        Offset  Phred   ASCII
>
> Sanger        33      0–93    33–126
> Solexa        64      -5–62   59–126
> Illumina1.3+  64      0–62    64–126
>
>
> If I use the Solexa function to handle Illumina1.3+, is it compatible?

In ShortRead, FastqQuality and SFastqQuality determine the _encoding_; 
SFastqQuality is appropriate for Solexa and Illumina1.3+. Functions in 
ShortRead, e.g., alphabetScore() or as(quality(), "matrix") operate on 
the integer value of the corresponding letter. ShortRead does not 
(unless I am missing some code) translate the encoding into probabilities.
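
As a rough illustration of what "encoding" means here, the sketch below maps each quality character to an integer score by subtracting the offset. It is written in plain Python so that it does not depend on ShortRead; the offsets come from the table quoted above, and the names OFFSETS and decode_quality are made up for this example (they are not ShortRead functions).

```python
# Offsets as listed in the quoted table; the names here are illustrative only.
OFFSETS = {"Sanger": 33, "Solexa": 64, "Illumina1.3+": 64}

def decode_quality(qual, fmt):
    """Return the integer score for each character of a quality string."""
    offset = OFFSETS[fmt]
    return [ord(ch) - offset for ch in qual]

# The same scores look different under the two offsets.
print(decode_quality("II5+", "Sanger"))        # [40, 40, 20, 10]
print(decode_quality("hhTJ", "Illumina1.3+"))  # [40, 40, 20, 10]
```

Solexa and Illumina1.3+ share offset 64, so the decoded integers agree; only the interpretation of those integers as probabilities differs.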

Biostrings PhredQuality and SolexaQuality also represent encoding, but 
allow coercion to numeric, as(<...>, "numeric"). These coercions use 
-10 log10(p) for PhredQuality and -10 log10(p / (1 - p)) for 
SolexaQuality. The latter is not appropriate for Illumina1.3+ (although 
the differences are most pronounced when p is large, i.e., when reads 
have low quality anyway). I will add an additional class, 
IlluminaQuality, to Biostrings in the 'devel' branch.
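
To see how far the two formulas diverge, here is a minimal sketch of both conversions and their inverses in plain Python. It only restates the arithmetic above and is not the Biostrings implementation; the function names are invented for this example.

```python
import math

def phred_score(p):
    """Phred score for error probability p: -10 log10(p)."""
    return -10 * math.log10(p)

def solexa_score(p):
    """Solexa score for error probability p: -10 log10(p / (1 - p))."""
    return -10 * math.log10(p / (1 - p))

def phred_to_prob(q):
    """Invert the Phred formula: p = 10^(-q/10)."""
    return 10 ** (-q / 10)

def solexa_to_prob(q):
    """Invert the Solexa formula: p = 1 / (1 + 10^(q/10))."""
    return 1 / (1 + 10 ** (q / 10))

# Compare the error probability implied by the same integer score.
for q in (0, 10, 20, 40):
    print(q, phred_to_prob(q), solexa_to_prob(q))
```

At a score of 0 the Phred formula gives p = 1 while the Solexa formula gives p = 0.5; at a score of 40 the two probabilities differ by about 1e-8, which is the sense in which the difference matters mainly for low-quality bases.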

Martin
-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


