[BioC] edgeR uneven group sizes

Gordon K Smyth smyth at wehi.EDU.AU
Fri Jul 5 01:49:11 CEST 2013


Dear Charles,

There is no requirement in edgeR for equal group sizes, and never has 
been.  I am puzzled why you might think there is such an assumption. 
edgeR always allows you to use all the available data that is 
scientifically meaningful.
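
As a minimal sketch of this point, the unbalanced sample_data layout quoted 
below (four control subjects, two disease subjects) can be given a design 
matrix in base R using the same nested formula style as Section 3.5 of the 
edgeR user's guide.  The variable names are taken from the quoted table, 
dropping the all-zero columns is an assumption about how one would handle 
the subject levels missing from the smaller group, and the edgeR call is 
left as a comment because no counts are given in this thread.

```r
# Unbalanced layout from the quoted sample_data table:
# 4 control subjects and 2 disease subjects, 3 time points each (18 samples).
Condition <- factor(rep(c("control", "Disease"), times = c(12, 6)),
                    levels = c("control", "Disease"))
Subject   <- factor(c(rep(1:4, each = 3), rep(1:2, each = 3)))
Time      <- factor(rep(c("0hr", "1hr", "2hr"), times = 6))

# Subjects are nested within Condition; Time effects are fitted
# separately within each Condition (as in user's guide Section 3.5).
design <- model.matrix(~ Condition + Condition:Subject + Condition:Time)

# Subjects 3 and 4 do not exist in the Disease group, so their columns
# are all zero.  Remove them so that the design matrix has full rank.
design <- design[, colSums(design != 0) > 0]

# Every one of the 18 samples is kept; nothing is dropped to balance groups.
dim(design)
colnames(design)

# With a DGEList 'y' (hypothetical here), this matrix would then be
# passed to the edgeR GLM functions, for example glmFit(y, design).
```

The resulting matrix uses every available sample, with one coefficient per 
subject and separate time effects for each condition.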

You say that you read "the initial posting that led to this section of 
the manual and it said to drop the samples that don't have equal numbers" 
but I do not know what you are referring to.  I have never seen such 
advice.

Best wishes
Gordon

> Date: Wed, 3 Jul 2013 09:49:30 -0500
> From: Charles Determan Jr <deter088 at umn.edu>
> To: Bioconductor mailing list <bioconductor at r-project.org>
> Subject: [BioC] edgeR uneven group sizes
>
> Hello,
>
> I recently had a question regarding repeated measures RNA-seq analysis. 
> This has been thoroughly answered through an extension of the edgeR 
> manual section 3.5. However, this has led me to another question 
> as I attempted to extend such concepts to another experiment wherein the 
> sample size in each group is different.  For example, here is a 
> dataframe modified from the edgeR user manual concerning between and 
> within subjects comparisons (Section 3.5) and another containing 
> specific time points to explain my point, both dataframes re-numbered 
> as recommended by the manual.
>
>> targets
>    Disease   Patient  Treatment
> 1  Healthy   1        None
> 2  Healthy   1        Hormone
> 3  Healthy   2        None
> 4  Healthy   2        Hormone
> 5  Healthy   3        None
> 6  Healthy   3        Hormone
> 7  Disease1  1        None
> 8  Disease1  1        Hormone
> 9  Disease1  2        None
> 10 Disease1  2        Hormone
> 11 Disease2  1        None
> 12 Disease2  1        Hormone
> 13 Disease2  2        None
> 14 Disease2  2        Hormone
> 15 Disease2  3        None
> 16 Disease2  3        Hormone
>
>> sample_data
>    Condition  Subject  Time
> 1  control    1        0hr
> 2  control    1        1hr
> 3  control    1        2hr
> 4  control    2        0hr
> 5  control    2        1hr
> 6  control    2        2hr
> 7  control    3        0hr
> 8  control    3        1hr
> 9  control    3        2hr
> 10 control    4        0hr
> 11 control    4        1hr
> 12 control    4        2hr
> 13 Disease    1        0hr
> 14 Disease    1        1hr
> 15 Disease    1        2hr
> 16 Disease    2        0hr
> 17 Disease    2        1hr
> 18 Disease    2        2hr
>
> I have read the initial posting that led to this section of the manual 
> and it said to drop the samples that don't have equal numbers.  Now this 
> doesn't seem to be a big deal if only dropping a sample or two from one 
> group, but it could potentially be a problem such as above, where dropping 
> four or six samples seems more of a sacrifice.  I begin to think of 
> experiments in which (assuming repeated/dependent samples) group numbers 
> vary more significantly as a result of difficulty acquiring samples. 
> Are there any recommendations from the community regarding such a 
> situation?  All I have found assumes that the number of samples in each 
> group is equal.
>
> Regards,
> -- 
> Charles Determan
> Integrated Biosciences PhD Candidate
> University of Minnesota

