[BioC] ShortRead - readAligned() with bowtie & qual

Kasper Daniel Hansen khansen at stat.berkeley.edu
Wed Jan 20 14:38:18 CET 2010


alignQuality is not the same as quality.

quality is the per-base quality of the reads (which is what you are interested in).  alignQuality is the quality of the _alignment_, which Bowtie does not report, hence the NA values (one could argue that a perfect-match alignment is better than a 1-mismatch alignment, and so on).  Note also that alignQuality is a numeric vector with only one element per read, whereas the read qualities have one element per base per read.

So you need to operate on quality(aln), not alignQuality(aln).
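To make the per-base versus per-read distinction concrete, here is a minimal base-R sketch (not ShortRead code) that decodes two quality strings into integer scores. The strings are the first 10 characters of two reads from the quoted Bowtie output; the offset of 64 is an assumption about how this run was encoded (use 33 for Phred+33 data), and the names qual_strings, per_base and per_read_mean are made up for illustration.

```r
# First 10 quality characters of two reads from the quoted Bowtie output.
qual_strings <- c("DPYWYWYYWW", "BBBBBBBBBB")

# Assumption: scores are encoded as ASCII code minus 64; use 33L for Phred+33.
offset <- 64L

# Per-base qualities: one integer vector per read, one element per base.
per_base <- lapply(qual_strings, function(q) utf8ToInt(q) - offset)

# A per-read summary (here the mean) has one element per read,
# which is the shape alignQuality() has, not the shape quality() has.
per_read_mean <- vapply(per_base, mean, numeric(1))

print(per_base)
print(per_read_mean)
```

On the real object you would start from quality(aln) rather than hand-typed strings; I believe as(quality(aln), "matrix") gives the analogous reads-by-cycle matrix of scores in ShortRead, but check the package documentation for your version.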

Kasper

On Jan 20, 2010, at 6:36 AM, Marc Noguera wrote:

> Dear list,
> I am trying to do some quality assessment on Solexa runs using
> Bioconductor and ShortRead.
> I am using Bowtie as a mapper, which yields Bowtie-formatted output with
> FASTQ quality scores for each aligned read, such as:
>> HWUSI-EAS621_91022_1_100_1938_1667 +   chr15   53573544   
>> CAGTCTCCCAAAGTACTGGGATAATAGGTGTGAGACTCC
>> DPYWYWYYWWWWPWWYWTVWWYWWWYYWYWXWBBBBBBB 0   34:C>A,36:A>T
>> HWUSI-EAS621_91022_1_100_1938_1823 -   chr18   34747447   
>> ACCCGGGAGTTGGGCTGCTTAGTGGCTGGACTCTCTTCC
>> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0   34:T>G
>> HWUSI-EAS621_91022_1_100_1938_608  +   chr19   35665132   
>> CAGCTGCTCAGGAGGCTGAGGCAGGAGAATCGCTTGAGC
>> DMTTTSRSTUTTTTTUTTTTTTTTTTQSSBBBBBBBBBB 2
>> HWUSI-EAS621_91022_1_100_1938_1207 +   chr22   30069585   
>> TCTGGGCCGTGGGGAGGCTCCTCCTTGGCTGATGGCGCC
>> DMTUTTRUTPTSTTUUUTSSTTUTBBBBBBBBBBBBBBB 0   35:T>C,37:A>C
>> HWUSI-EAS621_91022_1_100_1938_222  -   chr20   61020239   
>> GCCTGGGCCTCCCGAAGTGCTGTGGTTACAGGCATGAGC
>> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 2   25:A>G,34:C>G
>> HWUSI-EAS621_91022_1_100_1938_1562 +   chr15   84916971   
>> TGGGTTTCACCATGGTGGCCAGGCTGGTCTCAAACTCCT
>> DNUVUWWWWWWWWWWWWWUWWWWWWWVBBBBBBBBBBBB 0
>> HWUSI-EAS621_91022_1_100_1938_1290 -   chr9    120742911  
>> AGCCCAAGAGAGCCTTCTCCTCGACCATTACCACCAATG
>> BBBBBBBBBBBBBBBBSWRLPWUWRSWUKTWXXWXWWND 0   33:C>A,35:T>C
> When I try to read this file with the readAligned() function:
>> aln <- readAligned("/path/", pattern="test.fastq.bwt", type="Bowtie", qualityType="FastqQuality")
> 
> I obtain an AlignedRead object, which includes quality data.
>>> quality(aln)
>> class: SFastqQuality
>> quality:
>>  A BStringSet instance of length 3331015
>>          width seq
>>      [1]    35 BBB=B?:AA:@?@>?B@@AA@@A;>@4>>7922=>
>>      ...   ... ...
>> [3331015]    33 %%/<<<1;:<<:<<<<995<<<:<::<<<:<<<
> However, when I try to use these qualities for plotting, I obtain "NA" values:
>>> alignQuality(aln)
>> class: NumericQuality
>> quality: NA NA ... NA NA (3331015 total)
> So, I guess there is some kind of problem when transforming the ASCII
> characters to numerical quality values. I have also tried reading the
> input with the SFastqQuality type, with no success.
> 
> What am I doing wrong?
> 
> thanks in advance
> Marc
> 
> -- 
> 
> -----------------------------------------------------
> Marc Noguera i Julian, PhD
> Genomics unit / Bioinformatics
> Institut de Medicina Preventiva i Personalitzada
> del Càncer (IMPPC)
> B-10 Office
> Carretera de Can Ruti
> Camí de les Escoles s/n
> 08916 Badalona, Barcelona