[BioC] shortread base quality

David martin vilanew at gmail.com
Mon Aug 27 15:17:56 CEST 2012


Thanks Martin, we are almost there.
Reads are not the same size (range from 14 to 30 nt) . Any other idea ?  
Maybe build an empty matrix and fill in the scores with NA values when 
reads are short ??


On 08/27/2012 03:07 PM, Martin Morgan wrote:
> On 08/27/2012 12:43 AM, David martin wrote:
>> Not really Martin,
>> It's the score for each nucleotide so the matrix should be something
>> like this
>>
>>      1 2 3 4 5 6 7 .. 30
>> [1,]    36 36 36 36 36 .. 36 #score for each nucleotide at each position
>> of the read.
>> [2,]    36 36 36 36 36 .. 36
>> [3,]    36 36 28 36 36 .. 36
>> [4,]    36 36 36 36 28 .. 36
>> [5,]    36 28 36 36 36 .. 36
>>
>> Any idea ?
>
> if the reads are all the same width (implied by the matrix above) then
> as(quality(rfq), "matrix"); sorry about the misleading info. Martin
>
>>
>> On 08/24/2012 06:30 PM, Martin Morgan wrote:
>>> On 08/24/2012 07:09 AM, David martin wrote:
>>>> Hi,
>>>> I'm trying to get the quality scores for each nucleotide for each read
>>>> from the fastq file.
>>>> Here is how it starts. I know to get average scores for reads but not
>>>> for each individual nucleotide of each read.
>>>>
>>>>
>>>>
>>>> file <- "file.fastq"
>>>> fqfile <- paste(basename(file),"",sep="")
>>>> path <- dirname(file)
>>>> sp <- SolexaPath(path,dataPath=path,analysisPath=path)
>>>> fq <- readFastq(sp, fqfile)
>>>>
>>>> #Get quality scores
>>>> score <- alphabetScore(fq)
>>>>
>>>> #Gives the sum of the base quality scores for each read
>>>> aveScore <- score / width(fq)
>>>
>>> try alphabetFrequency, e.g.,
>>>
>>>    ragged = narrow(quality(rfq), 1, width=c(20, 30)) ## recycle width
>>>    alf = alphabetFrequency(ragged)
>>>
>>>  > dim(alf)
>>> [1] 256  94
>>>  > alf[1:5, 1:5] # not very exciting at this end...
>>>         ! " # $
>>> [1,] 0 0 0 0 0
>>> [2,] 0 0 0 0 0
>>> [3,] 0 0 0 0 0
>>> [4,] 0 0 0 0 0
>>> [5,] 0 0 0 0 0
>>>
>>>
>>> Martin
>>>
>>>>
>>>> #How can i get the score for each base for each read ????
>>>>
>>>> thanks,
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>



More information about the Bioconductor mailing list