[BioC] Typo in ?FastqQuality help page?

Martin Morgan mtmorgan at fhcrc.org
Mon May 24 00:31:45 CEST 2010


On 05/22/2010 06:39 PM, Peng Yu wrote:
> On Sat, May 22, 2010 at 5:44 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>> On 05/22/2010 03:38 PM, Peng Yu wrote:
>>> Hi Martin,
>>>
>>> '?FastqQuality' leads me to a page with the first line 'QualityScore
>>>         package:ShortRead             R Documentation'
>>>
>>> Then I see,
>>>
>>>      Use these functions to construct quality indicators for reads or
>>>      alignments. See 'QualityScore' for details of object content and
>>>      methods available for manipulating them.
>>> ...
>>>      Constructors return objects of the corresponding class derived
>>>      from 'QualityScore'.
>>>
>>> ...
>>>      'QualityScore', 'readFastq', 'readAligned'
>>>
>>>
>>> However, when I query the help page of QualityScore, I get nothing.
>>>
>>>> ?QualityScore
>>> No documentation for 'QualityScore' in specified packages and libraries:
>>> you could try '??QualityScore'
>>
>> ?"QualityScore-class"
> 
> I don't see where the difference between FastqQuality and
> SFastqQuality is explained. Is the first one for Sanger and the
> second one for Illumina, according to the following webpage?
> 
> http://en.wikipedia.org/wiki/FASTQ_format
> 
> There are four different quality score schemes in total. Would you please
> let me know which corresponds to which class in the ShortRead package?
> 
> S - Sanger        Phred+33,  raw reads typically (0, 40)
> X - Solexa        Solexa+64, raw reads typically (-5, 40)
> I - Illumina 1.3+ Phred+64,  raw reads typically (0, 40)
> J - Illumina 1.5+ Phred+64,  raw reads typically (3, 40) with
>     0=unused, 1=unused, 2=Read Segment Quality Control Indicator (bold)

There are two different types of information here: the quality score
(phred vs. solexa) and the encoding (+33 vs. +64). FastqQuality uses the
+33 encoding; SFastqQuality uses the +64 encoding. The classes are
largely silent about the underlying interpretation of the score as phred
versus solexa quality. Also, ShortRead can arrive at the wrong
representation, e.g., when reading a fastq file that contains quality
scores but no indication of what scale those scores are on or how they
are encoded.
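
To make the two kinds of information concrete, here is a minimal sketch in
plain Python. It is not ShortRead code and does not reflect how ShortRead
is implemented; the function names (decode_quality, phred_error_prob,
solexa_error_prob) and the example quality string are made up for
illustration. The encoding step subtracts an ASCII offset (+33 or +64), and
the interpretation step uses the standard phred and solexa definitions of
the score.

```python
def decode_quality(qual, offset):
    """Turn a FASTQ quality string into integer scores.

    `offset` is the ASCII offset of the encoding: 33 or 64.
    """
    return [ord(ch) - offset for ch in qual]


def phred_error_prob(q):
    """Error probability under the phred definition: Q = -10 * log10(p)."""
    return 10 ** (-q / 10)


def solexa_error_prob(q):
    """Error probability under the solexa definition: Q = -10 * log10(p / (1 - p))."""
    return 1 / (1 + 10 ** (q / 10))


# Hypothetical quality string. Every character is a legal symbol under
# both encodings, so the string alone does not reveal which one was used.
qual = "IIIIHGF@"

as_plus33 = decode_quality(qual, 33)  # scores if the file is +33 encoded
as_plus64 = decode_quality(qual, 64)  # scores if the file is +64 encoded

print("+33:", as_plus33)
print("+64:", as_plus64)

# The same integer score also means different things on the two scales.
print("phred  Q=0:", phred_error_prob(0))
print("solexa Q=0:", solexa_error_prob(0))
```

For instance, the character 'I' decodes to 40 under +33 but to 9 under +64,
which is why a file that carries no indication of its encoding can end up
in the wrong representation.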

Martin


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


