[BioC] total number of reads? mapped reads? or total counts?

Mark Robinson mark.robinson at imls.uzh.ch
Tue Dec 13 09:22:18 CET 2011


Hi Shan,

We typically use a fourth concept, the notion of 'effective' library size.

The idea is quite simple, and spelled out here:
http://genomebiology.com/2010/11/3/R25

It is implemented in edgeR's calcNormFactors() function.
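
As a rough sketch of how this fits together in practice: the normalization factors returned by calcNormFactors() are multiplied by the original library sizes to give the effective library sizes. The count matrix below is simulated toy data, not from any real experiment, and names such as `counts` and `eff.lib.size` are chosen only for illustration.

```r
library(edgeR)

set.seed(1)

# Hypothetical toy data: 1000 genes x 4 samples of Poisson counts.
counts <- matrix(rpois(4000, lambda = 50), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:4)))

# Inflate a handful of genes in sample 4 to mimic composition bias:
# its total count rises although most genes are unchanged.
counts[1:20, 4] <- counts[1:20, 4] * 50L

# lib.size is not supplied, so it defaults to colSums(counts).
y <- DGEList(counts = counts)

# Compute scaling factors (TMM is the default method).
y <- calcNormFactors(y)

# Effective library size = original library size * normalization factor.
eff.lib.size <- y$samples$lib.size * y$samples$norm.factors

print(y$samples)
print(eff.lib.size)
```

If lib.size is passed to DGEList() explicitly (for example, the number of mapped reads), the same factors are applied on top of that value, so the choice of starting library size and the normalization step are separate decisions.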

HTH,
Mark

On 12.12.2011, at 17:44, wang peter wrote:

> Hello all,
> 
> In the edgeR package, lib.size is described as a vector of length
> ncol(counts) giving the total number of reads sequenced for each sample.
> If not separately provided, it will be set to colSums(counts).
> 
> But there are three different concepts here.
> 
> Usually the number of mapped reads < total reads < counts,
> because not all of the reads can be mapped,
> and one mapped read can have more than one hit.
> 
> So which one should be used in the NB model?
> 
> -- 
> shan gao
> Room 231(Dr.Fei lab)
> Boyce Thompson Institute
> Cornell University
> Tower Road, Ithaca, NY 14853-1801
> Office phone: 1-607-254-1267(day)
> Official email:sg839 at cornell.edu
> Facebook:http://www.facebook.com/profile.php?id=100001986532253
> 

----------
Prof. Dr. Mark Robinson
Bioinformatics
Institute of Molecular Life Sciences
University of Zurich
Winterthurerstrasse 190
8057 Zurich
Switzerland

v: +41 44 635 4848
f: +41 44 635 6898
e: mark.robinson at imls.uzh.ch
o: Y32-J-34
w: http://tiny.cc/mrobin


