[BioC] edgeR: a question about library size

Mark Robinson mrobinson at wehi.EDU.AU
Thu Jun 17 14:15:11 CEST 2010


Hi Raffaele.

In my experience, you're better off with the number of mapped reads. A safer approach, though, is to do something data-driven. For example, TMM normalization (http://genomebiology.com/2010/11/3/R25) is implemented in the calcNormFactors() function. See also the docs and the user's guide.
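
To make the idea behind TMM a bit more concrete, here is a minimal Python sketch of a trimmed mean of M-values between one sample and a reference. It is a simplified illustration and not edgeR's code: calcNormFactors() also handles choosing the reference sample and rescaling the factors, which is left out here. The function name tmm_factor, its parameters and the toy counts are made up for this example.

```python
import math


def tmm_factor(sample, reference, lib_sample=None, lib_reference=None,
               trim_m=0.30, trim_a=0.05):
    """Simplified TMM scaling factor of `sample` relative to `reference`.

    Hypothetical helper for illustration only; not the edgeR implementation.
    Library sizes default to the column sums of the counts.
    """
    n_k = lib_sample if lib_sample is not None else sum(sample)
    n_r = lib_reference if lib_reference is not None else sum(reference)

    rows = []  # one (M, A, weight) tuple per usable gene
    for y_k, y_r in zip(sample, reference):
        if y_k <= 0 or y_r <= 0:
            continue  # log ratios are undefined for zero counts
        p_k, p_r = y_k / n_k, y_r / n_r
        m = math.log2(p_k / p_r)        # log fold change (M-value)
        a = 0.5 * math.log2(p_k * p_r)  # average log abundance (A-value)
        # Approximate variance of M; its inverse is used as the weight.
        var = (n_k - y_k) / (n_k * y_k) + (n_r - y_r) / (n_r * y_r)
        if var > 0:
            rows.append((m, a, 1.0 / var))
    if not rows:
        return 1.0

    def kept(values, trim):
        # Indices that survive trimming a fraction `trim` from each tail.
        order = sorted(range(len(values)), key=values.__getitem__)
        cut = int(len(values) * trim)
        return set(order[cut:len(order) - cut])

    keep = kept([r[0] for r in rows], trim_m) & kept([r[1] for r in rows], trim_a)
    if not keep:
        return 1.0

    total_weight = sum(rows[i][2] for i in keep)
    mean_m = sum(rows[i][0] * rows[i][2] for i in keep) / total_weight
    return 2.0 ** mean_m


# Toy data: identical counts except for one gene that dominates the sample.
reference = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
sample = [10, 20, 30, 40, 50, 60, 70, 80, 90, 1000]

factor = tmm_factor(sample, reference)
effective_lib_size = sum(sample) * factor
```

In this toy case the single dominant gene inflates the raw library size of the sample from 550 to 1450. Because that gene is trimmed away, the factor brings the effective library size back to about 550, which is the kind of composition effect a fixed choice of lib.size cannot correct on its own.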

Hope that helps.

Cheers,
Mark

On 2010-06-17, at 10:00 PM, rcaloger wrote:

> Hi,
> I am using edgeR to detect differential expression in NGS experiments.
> I have a brief question on what I should consider as the "total size of my 
> libraries".
> In my case I have a set of samples with quite a large variation in 
> the library size:
> 
> Sample  Total reads  Mapped reads
>  1      11076283      8736308
>  2       5881045      4006468
>  3       7139703      5108608
>  4       9089153      5643701
>  5       9723103      8457914
>  6      15570265      8706332
>  7      15844448     12056310
>  8      13375681      8663496
>  9      14997114      8799752
> 10      15744584      8555922
> 11       4642056      3201515
> 12       6458028      4277204
> 13      13206724      9466118
> 14       3035032      2148730
> 
> Should I pass as the lib.size parameter the values referring to the real 
> size of the libraries (Total reads) or
> simply the number of mapped reads (Mapped reads)?
> 
> Thanks for the help
> Raffaele
> 
> -- 
> 
> ----------------------------------------
> Prof. Raffaele A. Calogero
> Bioinformatics and Genomics Unit
> Dipartimento di Scienze Cliniche e Biologiche
> c/o Az. Ospedaliera S. Luigi
> Regione Gonzole 10, Orbassano
> 10043 Torino
> tel.   ++39 0116705417
> Lab.   ++39 0116705408
> Fax    ++39 0119038639
> Mobile ++39 3333827080
> email: raffaele.calogero at unito.it
>        raffaele[dot]calogero[at]gmail[dot]com
> www:   http://www.bioinformatica.unito.it
> Info: http://publicationslist.org/raffaele.calogero
> 
> 

------------------------------
Mark Robinson, PhD (Melb)
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robinson at garvan.org.au
e: mrobinson at wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852
------------------------------





