[BioC] ShortRead Per Cycle Score

mtmorgan at fhcrc.org mtmorgan at fhcrc.org
Tue Apr 27 21:26:06 CEST 2010


Hi Dario --

Quoting Dario Strbenac <D.Strbenac at garvan.org.au>:

> Hello again,
>
> I'm now stuck on recreating the per cycle scores from the HTML report graph.
>
> When I do the plot, I get the typical decline to the right and the   
> scores go from about 32 at the left to 22 at the right.
>
> QAaligned <- qa("myPath/s_6_1", "uniq.map.gz", "Bowtie", qualityType  
>  = "SFastqQuality")
> pcq <- QAaligned[["perCycle"]][["quality"]]
> ShortRead:::.plotCycleQuality(pcq)
>
> But then when I calculate the scores by myself I get a very different trend.
>
> sbCyc <- split(pcq, pcq$Cycle)
>
> avgScores <- sapply(sbCyc, function(cycTable){
>   totalCounts <- sum(cycTable[, "Count"])
>   totalScore = 0
>   for(index in 1:nrow(cycTable))
>   {
>     totalScore = totalScore + (cycTable[index, "Score"] *   
> cycTable[index, "Count"])
>   }
>   avg <- totalScore / totalCounts
>   return(avg)

or just

   with(cycTable, sum(Score * Count) / sum(Count))

The underlying problem is that Score was not being calculated  
correctly (report() did not use Score when generating the figure);  
this has been corrected in both release and devel branches, along with  
an issue Nicholas Delhomme pointed out about transposed labeling of  
'aligned' versus 'filtered' reads for type=SolexaExport, version *.*.1.

Sorry for the confusion.

Martin

> })
>> avgScores
>         1         2         3         4         5         6           
> 7         8         9        10        11        12        13         
>  14        15        16        17
>  6.663533  7.743573  7.894290  7.752292  7.736450  7.667488    
> 7.602862  7.589947  7.452226  7.576258  7.454091  7.204792  7.396363  
>   7.282941  8.065390 11.513014 12.371052
>        18        19        20        21        22        23          
> 24        25        26        27        28        29        30        
>   31        32        33        34
> 12.073634 11.947487 12.039822 11.890664 12.277044 12.662244   
> 12.558223 13.296835 12.045379 12.147517 12.305975 10.955502   
> 11.982833 11.913569 11.072581 10.821556 12.029892
>        35        36
> 14.325464 13.316201
>
> It increases, and ranges from 6 to 13. I can't see a typo, so am I   
> missing some sort of mathematical transformation somewhere ?
>
> Thanks.
>
> --------------------------------------
> Dario Strbenac
> Research Assistant
> Cancer Epigenetics
> Garvan Institute of Medical Research
> Darlinghurst NSW 2010
> Australia
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:   
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>



More information about the Bioconductor mailing list