[BioC] EdgeR questions in analyzing 454 data-about prior.n, TMM, and p_value

Gordon K Smyth smyth at wehi.EDU.AU
Wed Oct 20 00:42:46 CEST 2010

Dear Ying Ye,

Just adding to one of Mark's comments; see below.

> Date: Tue, 19 Oct 2010 09:43:10 +1100
> From: Mark Robinson <mrobinson at wehi.EDU.AU>
> To: Ying Ye <mikecrux at gmail.com>
> Cc: Bioc-sig-sequencing at r-project.org, bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] [Bioc-sig-seq] EdgeR questions in analyzing 454
> 	data-about prior.n, TMM, and p_value
> Hi Ying.
> Some comments below.
> On 2010-10-18, at 10:22 PM, Ying Ye wrote:
>> Dear edgeR users and developers,
>> I have a few questions about edgeR, which I have recently been using for
>> 454 pyrosequencing data:
>> 1. prior.n
>>     According to the users' manual, we should not use too low a prior.n in
>> the moderated tagwise dispersion approach. But in my dataset, there are
>> more than 15 samples in each comparison group and the degrees of freedom
>> are larger than 30. prior.n <- estimateSmoothing(d) gives 0.0005329. So I
>> am wondering whether I could use 0.0005329, since I have a rather large
>> number of samples in each group, or whether I should set prior.n to 10
>> according to the manual's suggestion.
> Well, it's hard to give a prescription for prior.n for all datasets. 
> Since you have so many degrees of freedom, you shouldn't need prior.n as 
> high as 10.  You might try something lower, say 1-3.

Just to refine this, how many degrees of freedom do you have per tag? 
Let's define df = number of libraries - number of groups.  I would suggest 
you choose your prior.n so that prior.n * df is around 50, but don't go 
below prior.n=1.
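
As a rough sketch of that rule of thumb in R: the helper name
choose_prior_n and its arguments are made up for illustration and are not
part of edgeR, and the library counts in the example are assumed rather
than taken from your data.

```r
# Hypothetical helper (not an edgeR function): choose prior.n so that
# prior.n * df is roughly `target`, but never let prior.n drop below 1.
choose_prior_n <- function(n_libraries, n_groups, target = 50) {
  df <- n_libraries - n_groups  # degrees of freedom per tag
  stopifnot(df > 0)
  max(1, target / df)
}

# Assumed example: 2 groups of 16 libraries each, so df = 32 - 2 = 30
prior.n <- choose_prior_n(n_libraries = 32, n_groups = 2)
prior.n
```

With df around 30 this gives a prior.n between 1 and 2, and once df
reaches 50 or more the floor of prior.n=1 applies.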

We are not recommending estimateSmoothing() at the moment because it gives 
variable results on next-generation sequencing data.  The 
estimateSmoothing() value for your data is too small to be recommended.

Best wishes

